Fleeting Years Time goes.

Electron as GUI of Python Applications

what

Electron (formerly Atom Shell) is a desktop "shell" powered by Node.js. It was created by GitHub and is used to build the Atom editor.

Python is a simple and powerful programming language.

This post is a note about how to use Electron as the desktop GUI for Python applications.

but why?

Building desktop applications with Python is not easy.

Tkinter is the standard package for Python GUI, but it is ~~very~~ ugly.

The only mature, real-world solution is Qt, through the Python packages PySide or PyQt, and Enaml, which builds on them. However, PySide seems to have died, and PyQt is not free for commercial use.

IPython, an enhanced shell for Python, has an interesting design: it has a kernel, a qtconsole powered by Qt, and a notebook powered by web pages.

So I was thinking, why not use Electron as the "GUI shell" for the Python applications by embedding web pages? It is free, and hopefully elegant.

the architecture

The basic idea is rather simple:

The first way: Electron as the "launcher and minimal web browser", loading the web pages dynamically generated by Python, where behind the web pages Python does all the heavy lifting.

The second way: Electron as the "launcher and minimal web browser", loading statically written web pages (index.html and other static files), where these pages communicate with Python through a RESTful API or something like ZeroMQ.

The first way is easy to understand and implement, while the second way seems to offer more potential.
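To make the second way a little more concrete, here is a minimal sketch of the Python side of such an API, using only the standard library. The /api/hello route, the handle_api name, and the JSON shape are all assumptions made up for illustration. A real application might use Flask or ZeroMQ instead, and a page loaded from file:// may need extra cross-origin settings that are not handled here.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_api(path):
    """Map a request path to (status_code, payload). The routes are hypothetical."""
    if path == "/api/hello":
        return 200, {"message": "Hello from the Python backend"}
    return 404, {"error": "unknown endpoint: " + path}


class ApiHandler(BaseHTTPRequestHandler):
    """Serialize whatever handle_api() returns as a JSON response."""

    def do_GET(self):
        status, payload = handle_api(self.path)
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="127.0.0.1", port=5000):
    """Block forever, answering API requests from the local web pages."""
    HTTPServer((host, port), ApiHandler).serve_forever()
```

Calling serve() starts the backend on 127.0.0.1:5000. The static pages could then request http://localhost:5000/api/hello and render the JSON they receive.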

After that, we could use PyInstaller to package the Python files, then use Electron's built-in packaging method to bundle all the HTML, CSS, and JavaScript files together with the Python binaries. In the end we are able to distribute the generated binary files. Note, however, that it may be easy to extract the source code from the distributed files.

a complete example

This is the "hello world" example of Electron, modified to implement the first way mentioned above. Nothing magic. The key point is to create a child process that runs the Python script and then load the "home page" it generates.

Install Python and Node.js, then run:

pip install Flask
npm install electron-prebuilt -g
npm install jquery -g

Then create a working directory and cd into it.

We need a basic package.json:

{
  "name"    : "your-app",
  "version" : "0.1.0",
  "main"    : "main.js",
  "dependencies": {"electron-prebuilt":"", "jquery":""}
}

as well as the main.js:

var app = require('app');
var BrowserWindow = require('browser-window'); 
require('crash-reporter').start();
var mainWindow = null;
app.on('window-all-closed', function() {
  app.quit();
});

app.on('ready', function() {
  // call python
  var subpy = require('child_process').spawn('python', [__dirname + '/hello.py']);

  // Create the browser window.
  mainWindow = new BrowserWindow({width: 800, height: 600});

  // and load the index.html of the app.
  mainWindow.loadUrl('file://' + __dirname + '/index.html');
  //mainWindow.loadUrl('http://localhost:5000');

  // Open the devtools.
  mainWindow.openDevTools();

  // Emitted when the window is closed.
  mainWindow.on('closed', function() {
    mainWindow = null;

    // kill python
    subpy.kill('SIGINT');
  });
});

Notice that in main.js we spawn a child process for the Python application, but we load index.html first. Why? Because the web server needs time to start, so we load the static file first and let it detect whether the web server is ready. Once the web server is ready, the page redirects to it.

Here is index.html:

<!DOCTYPE html>
<html>
  <head><title>Hello World!</title></head>
  <body>
    <h1>Hello World!</h1>
    Loading...
  </body>

  <script type="text/javascript">
    window.$ = window.jQuery = require('jquery');
    var mainAddr = 'http://localhost:5000';
    $.ajax({
        type: 'HEAD',
        url: mainAddr,
        success: function() {
            //alert('good');
            location.replace(mainAddr);
        },
        error: function() {
            alert('some errors happen...');
            // page does not exist
        }
    });
  </script>
</html>

Notice the tricky way of loading jQuery. We are running the scripts in a Node-like environment instead of a traditional web browser, so we need to run npm install beforehand and then use require().

Lastly, the hello.py:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import print_function
import sys, time
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    #time.sleep(5)
    return "Hello World! This is powered by Python backend."

if __name__ == "__main__":
    print('oh hello')
    sys.stdout.flush()
    app.run(host='127.0.0.1', port=5000)

After all the files are generated, we could simply run Electron inside bash:

electron . # . as the working directory

A desktop application should be launched as desired.
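The index.html above probes the server only once, so a slow backend start ends in the error alert. One possible refinement is to retry until a deadline. The sketch below shows that idea in Python for illustration only (inside Electron it would be written in JavaScript). The names server_is_up and wait_until are hypothetical helpers, and the timeout values are arbitrary.

```python
import time
import urllib.error
import urllib.request

# Bypass any system proxy: the backend lives on the local machine.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def server_is_up(url):
    """Return True if `url` answers a HEAD request with any HTTP status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with _opener.open(request, timeout=1.0):
            return True
    except urllib.error.HTTPError:
        return True   # the server replied, even if with an error status
    except OSError:
        return False  # connection refused, timed out, and similar failures


def wait_until(probe, timeout=10.0, interval=0.2):
    """Call `probe()` repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would combine the two, for example wait_until(lambda: server_is_up("http://localhost:5000")), and only redirect to the Flask page when it returns True.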

further thinking

Electron is cool. But according to the issues filed against the Atom editor, performance is one of its main problems.

"Everything is a website" is also cool. But well, we may ~~easily~~ reach the limitations of web technologies.

That said, I believe "Electron as GUI for Python applications" is still an interesting approach to writing GUIs in Python.

some LaTeX notes

Here are some notes for LaTeX. All from a guy's blog: 始终.

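Chinese in LaTeX

Typesetting Chinese in LaTeX used to be very tedious. Things have moved quickly, and in most cases it is now convenient and needs no more tinkering.

Requirements: install the latest TeX Live 2015 / MacTeX 2015 (the key point is the CTeX 2.0 package bundle they include). Use XeLaTeX.

If you need Chinese typesetting conventions: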

%!TEX program = xelatex
\documentclass[UTF8]{ctexart}
\begin{document}
这个文档有中文版式和自动的字体配置。
\end{document}

If you only insert some Chinese text and do not need Chinese typesetting conventions:

%!TEX program = xelatex
\documentclass{article}
\usepackage[UTF8, heading = false, scheme = plain]{ctex}
\begin{document}
This article contains some 中文文字.
\end{document}

Compile everything with xelatex.

LaTeX and Sublime Text

Install Sublime Text, TeX Live 2015 / MacTeX 2015.

Install SumatraPDF on Windows, Skim on OS X, or Evince on Linux.

In Sublime, install Package Control, then install LaTeXTools. After installation, we have to run the command "Reconfigure LaTeXTools and migrate settings" before first use.

Then use Command + B or Ctrl + B inside Sublime to compile the LaTeX files. If we need Chinese characters, we could add %!TEX program = xelatex as the first line of the source file.

References

All notes are copied and/or built upon these links, which are distributed under CC BY-SA 3.0:

Alternative to Mathematica

Mathematica is a great solution to some computing problems. However, it has one big disadvantage: it is very expensive. Is there any way to avoid the fee? No, I am not talking about switching to R, Python, etc.

Firstly, if you are a university student, consult your university. I found that many universities in the United States provide Mathematica to their staff and students and allow them to install it on their personal computers. I am not sure what happens after a student graduates, but at least students may be able to buy the product more cheaply after graduation.

Secondly, use WolframAlpha, an official "free alternative" by Wolfram, the company behind Mathematica. It supports natural language input and some simple Mathematica commands. It is really helpful! One more thing: Wolfram provides a free, separate Integral Calculator!

Thirdly, use Mathics, a free online interface "simulating" Mathematica, which is powered by Python. It is even open source! It is not fast, but it is helpful, and unlike WolframAlpha it supports some "standard" Mathematica commands. I always combine it with WolframAlpha.

Lastly, after some research, I believe that sometimes time is money. In fact, Wolfram provides WolframAlpha Pro and Mathematica Online, which are very portable and useful, especially in urgent situations. Their monthly subscription model and online delivery mean that we can pay for and use them when needed, and avoid the fee the rest of the time.

I hope this post is helpful.

my English experience

I began learning English at the age of 10 in primary school. However, it was in 2005 that I fell in love with English and realized its importance in my life.

In 2005, the sixth Harry Potter book was released. As a big fan of the series, I could not wait to read it, but the Chinese edition had not come out yet. Coincidentally, my relatives came back from the UK with a copy in the original English as a gift for me. With enormous passion and an English dictionary at hand, I set out on the arduous journey of reading the English novel, trying to find out what happened to Harry Potter. From then on, English opened a door to a new world for me. After I entered senior high school, I started reading popular English novels such as the Sherlock Holmes series and The Lord of the Rings. I also read English books about science and technology, including U.S. textbooks on mathematics, physics, and chemistry, which explained concepts more clearly, in order to better prepare for the National College Entrance Examination. Thanks to these experiences in high school, I felt comfortable reading English literature and papers in my major at university.

Besides reading, I wrote. In 2009, the same year I started reading English textbooks, I discovered wordpress.com, a free online blogging community. On this platform, I recorded my daily life, wrote book reviews, and posted my thoughts about news and hot issues. Writing blog posts in English not only improved my writing skills but also greatly reduced my fear of writing in English. Getting rid of that fear is significant for any Chinese student. Because they feared writing, my teammates in research projects always elected me to finish the hard part of our projects: writing papers in English. Admittedly, it was hard the first time, yet I struggled through and finished before the deadline by learning the standard format of English papers and the English terminology involved. Besides writing papers, I made some money by doing translation online, which often required professional knowledge of a specific field and decent Internet search skills. These experiences have helped me improve my writing skills all the way.

Thanks to U.S. TV series and movies, my English listening and speaking are not bad. In 2010, U.S. TV series started to spread across the Chinese Internet. Among them, I love Friends and How I Met Your Mother the best. By watching them without Chinese subtitles, I trained my listening and enriched my knowledge of everyday communication in English. I was very proud when I could speak fluent English in English Corner activities at university and discuss the most recent TV series in English with friends who also loved watching them. Recently, I have been attracted to TED talks and Scientific American: 60-Second Science on the Internet, which demand better listening comprehension to absorb new ideas and technologies from around the world. Last but not least, I attended some English lectures given by foreign professors at our university and in the city, where I often talked with them to practice my English.

Hopefully my English skills will continue to improve in the future.

Dual Thrust Trading

Dual Thrust is a very simple but seemingly effective strategy in quantitative investment.

Attention: I am NOT responsible for ANY of your loss!

strategy

  1. After the close of the first day, let m = max(FirstDayHighestPrice-FirstDayClosePrice, FirstDayClosePrice-FirstDayLowestPrice), then let SecondDayTrigger1 = m * k1, SecondDayTrigger2 = m * k2. SecondDayTrigger1 and SecondDayTrigger2 are called trigger values.

  2. On the second day, note down the SecondDayOpenPrice. Once the price is higher than SecondDayOpenPrice + SecondDayTrigger1, buy. And once the price is lower than SecondDayOpenPrice - SecondDayTrigger2, sell short.

  3. This system is a reversal system. If the price rises above SecondDayOpenPrice + SecondDayTrigger1 while you hold one short lot, buy two lots. If the price falls below SecondDayOpenPrice - SecondDayTrigger2 while you hold one long lot, sell two lots.
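The three rules above can be sketched in a few lines of Python. This is only an illustration of the rules as stated, not the simulation code used later in this post. The function names are made up, the defaults k1=0.5 and k2=0.2 are borrowed from the R functions below, and positions are counted in whole lots (+1 long, -1 short, 0 flat).

```python
def dual_thrust_levels(prev_high, prev_low, prev_close, today_open, k1=0.5, k2=0.2):
    """Return (buy_level, short_level) for today from yesterday's daily bar."""
    m = max(prev_high - prev_close, prev_close - prev_low)
    return today_open + k1 * m, today_open - k2 * m


def signal(price, buy_level, short_level):
    """Return 'buy', 'short', or 'hold' for the current price."""
    if price > buy_level:
        return "buy"
    if price < short_level:
        return "short"
    return "hold"


def order_for(sig, position):
    """Signed lots to trade so the position becomes +1 on 'buy' and -1 on 'short'.

    Holding one short lot on a 'buy' gives +2 (the reversal rule), and
    holding one long lot on a 'short' gives -2.
    """
    target = {"buy": 1, "short": -1}.get(sig, position)
    return target - position
```

For instance, with the 2010-01-04 bar from the data below (high 3597, low 3535, close 3535), m is 62, so with an open of 3545 on the next day the buy level sits 31 points above the open and the short level 12.4 points below it.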

keypoints

This strategy is super easy, and it's possible to build an automated trading system to do all the work. But of course there are some risks. For example, it is not clear to me how to choose k1 and k2 or how they influence the result. Moreover, when the price oscillates sharply, the strategy can lose money in unexpected ways. Last but not least, no stop-loss order is included in this strategy. My guess is that reducing k2 may limit losses to some extent.

simulation

If you want to reproduce the result or do some further research, you can download min1_sh000300.csv and some other data from this page.

The code for this project, possibly updated, is on GitHub. Forks and pull requests are of course welcome.

I chose the Shanghai Shenzhen CSI 300 Index (000300.SS) to run the simulation. I acquired min1_sh000300.csv, the high-frequency (minute-level) index prices of the CSI 300 from 2010-01-01 to 2013-11-30, with some days missing.

There are some assumptions and limitations in the simulation. I assume I have 1,000,000 (one million) yuan in cash available (WOW), and I cannot borrow shares worth more than 50% of that one million. I apply the strategy from 2010-01-01 to 2013-11-11. There is no market impact and no transaction cost.

library

All the code below requires these three libraries. Install them first if you have any trouble running the code in this post.

library("lubridate")
library("ggplot2")
library("zoo")
## 
## Attaching package: 'zoo'
## 
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

load the data

a = read.csv("min1_sh000300.csv")
head(a)
##   Stock Code                Time Open High  Low Close  Volume    Amount
## 1    sh  300 2010-01-04 09:31:00 3592 3597 3592  3596 1160765 1.555e+09
## 2    sh  300 2010-01-04 09:32:00 3595 3596 3592  3592  539317 7.652e+08
## 3    sh  300 2010-01-04 09:33:00 3593 3593 3589  3589  434880 6.097e+08
## 4    sh  300 2010-01-04 09:34:00 3588 3588 3586  3586  392227 5.727e+08
## 5    sh  300 2010-01-04 09:35:00 3587 3587 3586  3586  370230 5.236e+08
## 6    sh  300 2010-01-04 09:36:00 3587 3587 3586  3586  367130 5.186e+08

minutedata = read.zoo(a[, 3:9], FUN = ymd_hms)
head(minutedata)
##                     Open High  Low Close  Volume    Amount
## 2010-01-04 09:31:00 3592 3597 3592  3596 1160765 1.555e+09
## 2010-01-04 09:32:00 3595 3596 3592  3592  539317 7.652e+08
## 2010-01-04 09:33:00 3593 3593 3589  3589  434880 6.097e+08
## 2010-01-04 09:34:00 3588 3588 3586  3586  392227 5.727e+08
## 2010-01-04 09:35:00 3587 3587 3586  3586  370230 5.236e+08
## 2010-01-04 09:36:00 3587 3587 3586  3586  367130 5.186e+08

generate the daily data

It's quite strange that Google and Yahoo! do not provide precise daily data for the CSI 300, so I have to generate the daily (low-frequency) data from the minute data!

gendaydata <- function(minutedata){
    alldaysdata = data.frame(Date=NULL, Open=NULL, High=NULL, Low=NULL, Close=NULL)

    tmphigh = NULL
    tmplow = NULL
    tmpopen = NULL
    tmpclose = NULL

    for(i in 1:(nrow(minutedata)-1)){
        print(i)

        if(as.Date(index(minutedata)[i])==as.Date(index(minutedata)[i+1])){
            tmpopen = c(tmpopen, minutedata[i]$Open)
            tmphigh = c(tmphigh, minutedata[i]$High)
            tmplow = c(tmplow, minutedata[i]$Low)
        }

        if(as.Date(index(minutedata)[i])!=as.Date(index(minutedata)[i+1])){
            tmphigh = c(tmphigh, minutedata[i]$High)
            tmplow = c(tmplow, minutedata[i]$Low)
            tmpclose = minutedata[i]$Close

            dayhigh = max(tmphigh)
            daylow = min(tmplow)
            dayopen = tmpopen[1]
            dayclose = tmpclose[1]
            daydate = as.character(index(minutedata)[i])
            singledaydata = data.frame(Date=daydate, Open=dayopen, High=dayhigh, Low=daylow, Close=dayclose)
            alldaysdata = rbind(alldaysdata, singledaydata)

            tmphigh = NULL
            tmplow = NULL
            tmpopen = NULL
            tmpclose = NULL
        }

        if(as.Date(index(minutedata)[i])==as.Date(index(minutedata)[i+1]) && i+1==nrow(minutedata)){
            # row i was already added by the first branch; add the final row (i+1) too
            tmphigh = c(tmphigh, minutedata[i+1]$High)
            tmplow = c(tmplow, minutedata[i+1]$Low)
            tmpclose = minutedata[i+1]$Close  # the day's close is the last row

            dayhigh = max(tmphigh)
            daylow = min(tmplow)
            dayopen = tmpopen[1]
            dayclose = tmpclose[1]
            daydate = as.character(index(minutedata)[i])
            singledaydata = data.frame(Date=daydate, Open=dayopen, High=dayhigh, Low=daylow, Close=dayclose)
            alldaysdata = rbind(alldaysdata, singledaydata)

            tmphigh = NULL
            tmplow = NULL
            tmpopen = NULL
            tmpclose = NULL
        }
    }

    return(alldaysdata)
}

Then I do this:

daydata = gendaydata(minutedata)
# requires a long long time!!
daydata = as.zoo(daydata[,2:5], as.Date(daydata[,1]))
# turn it into a zoo object
head(daydata)
##            Open High  Low Close
## 2010-01-04 3592 3597 3535  3535
## 2010-01-05 3545 3577 3498  3564
## 2010-01-06 3559 3589 3541  3542
## 2010-01-07 3543 3559 3453  3471
## 2010-01-08 3457 3482 3427  3480
## 2010-01-11 3593 3594 3466  3482
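The row-by-row loop above is slow mainly because it grows vectors and a data.frame inside the loop. The same daily open/high/low/close can be computed in a single pass by grouping minutes by date. Here is a rough Python sketch of that idea. It assumes the rows are already sorted by time and given as (timestamp, open, high, low, close) tuples, a layout invented for this example rather than the zoo object used in this post.

```python
from itertools import groupby


def daily_ohlc(minute_rows):
    """Aggregate time-sorted minute rows into one open/high/low/close bar per day.

    Each row is (timestamp, open, high, low, close), where timestamp is a
    'YYYY-MM-DD HH:MM:SS' string whose first 10 characters are the date.
    """
    bars = []
    for day, rows in groupby(minute_rows, key=lambda row: row[0][:10]):
        rows = list(rows)
        bars.append({
            "date": day,
            "open": rows[0][1],                   # first minute's open
            "high": max(row[2] for row in rows),  # highest high of the day
            "low": min(row[3] for row in rows),   # lowest low of the day
            "close": rows[-1][4],                 # last minute's close
        })
    return bars
```

Because every minute of a day is visited exactly once, the last row of the data needs no special case.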

run!

Two situations are worth discussing: the first is that investors cannot sell short, and the second is that the investors can sell short.

For example, A-shares in China do not allow shorting. In other words, investors can "just sell all the shares they own", but they cannot "borrow extra shares, sell them, and return the shares to the lenders the next time they buy". However, certain option and fund products in China can be sold short.

So first I define a trading function in which investors cannot sell short:

starttradesimp <- function(minutedata, daydata, minutesinday=240, k1=0.5, k2=0.2, startmoney=1000000){
    daydata$hmc = daydata$High - daydata$Close
    daydata$cml = daydata$Close - daydata$Low
    daydata$maxhmccml = (daydata$hmc + daydata$cml + abs(daydata$hmc - daydata$cml)) / 2
    daydata$trigger1 = daydata$maxhmccml * k1
    daydata$trigger2 = daydata$maxhmccml * k2
    print(daydata)

    timevetor = c()
    cashvetor = c()
    stockassetvetor = c()
    allvetor = cashvetor + stockassetvetor

    cash = startmoney
    hands = 0
    stockasset = 0

    for(i in 2:nrow(daydata)){
        trigger1 = as.numeric(daydata$trigger1[i-1])
        trigger2 = as.numeric(daydata$trigger2[i-1])

        for(k in ((i-1)*minutesinday+1):(i*minutesinday)){
            # access this day's minute data
            if(as.numeric(minutedata[k]$Open) > (as.numeric(daydata[i]$Open)+trigger1)){
                # buy
                print('buyyyyyyyyyyyyy!')
                thishands = cash %/% as.numeric(minutedata[k]$Open)
                cash = cash - thishands * as.numeric(minutedata[k]$Open)
                hands = thishands + hands
                stockasset = hands * as.numeric(minutedata[k]$Open)
            } else if(as.numeric(minutedata[k]$Open) < (as.numeric(daydata[i]$Open)-trigger2)){
                # sell
                print('sellllllllllllll!')
                cash = cash + hands * as.numeric(minutedata[k]$Open)
                hands = 0
                stockasset = 0
            } else{
                stockasset = hands * as.numeric(minutedata[k]$Open)
            }

            timevetor = c(timevetor, index(minutedata)[k])
            cashvetor = c(cashvetor, cash)
            stockassetvetor = c(stockassetvetor, stockasset)
            allvetor = c(allvetor, cash+stockasset)
            print(paste('i:', i, ', k:', k, ', cash:', cash, ', stockasset:', stockasset, ', ',index(minutedata)[k] ))
        }
    }

    return(data.frame(time=as.POSIXct(timevetor, origin='1970-01-01', tz='UTC'), cash=cashvetor, stockasset=stockassetvetor, all=allvetor))
}

And the second function in which investors can sell short:

starttrade <- function(minutedata, daydata, minutesinday=240, k1=0.5, k2=0.2, startmoney=1000000, borrowed_rate = 0.5){
    daydata$hmc = daydata$High - daydata$Close
    daydata$cml = daydata$Close - daydata$Low
    daydata$maxhmccml = (daydata$hmc + daydata$cml + abs(daydata$hmc - daydata$cml)) / 2
    daydata$trigger1 = daydata$maxhmccml * k1
    daydata$trigger2 = daydata$maxhmccml * k2
    print(daydata)

    timevetor = c()
    cashvetor = c()
    stockassetvetor = c()
    allvetor = cashvetor + stockassetvetor

    cash = startmoney
    hands = 0
    stockasset = 0
    borrowed_money = startmoney * borrowed_rate
    borrowed_hands = 0
    has_borrowed = FALSE

    for(i in 2:nrow(daydata)){
        trigger1 = as.numeric(daydata$trigger1[i-1])
        trigger2 = as.numeric(daydata$trigger2[i-1])

        for(k in ((i-1)*minutesinday+1):(i*minutesinday)){
            # access this day's minute data
            if(as.numeric(minutedata[k]$Open) > (as.numeric(daydata[i]$Open)+trigger1)){
                # buy
                print('buyyyyyyyyyyyyy!')
                thishands = cash %/% as.numeric(minutedata[k]$Open)
                cash = cash - thishands * as.numeric(minutedata[k]$Open)
                hands = thishands + hands - borrowed_hands
                stockasset = hands * as.numeric(minutedata[k]$Open)
                borrowed_hands = 0
                has_borrowed = FALSE
            } else if(as.numeric(minutedata[k]$Open) < (as.numeric(daydata[i]$Open)-trigger2)){
                # sell
                print('sellllllllllllll!')
                if(!has_borrowed){
                    borrowed_hands_this_time = borrowed_money %/% as.numeric(minutedata[k]$Open)
                    has_borrowed = TRUE
                } else{
                    borrowed_hands_this_time = 0
                }
                borrowed_hands = borrowed_hands + borrowed_hands_this_time
                cash = cash + (borrowed_hands_this_time + hands) * as.numeric(minutedata[k]$Open)
                hands = 0
                stockasset = 0
            } else{
                stockasset = hands * as.numeric(minutedata[k]$Open)
            }

            #print(borrowed_hands*as.numeric(minutedata[k]$Open))
            #print(borrowed_hands)
            #print(cash)
            #print(cash-borrowed_hands*as.numeric(minutedata[k]$Open))
            #print(as.numeric(minutedata[k]$Open))
            realcash = cash-borrowed_hands*as.numeric(minutedata[k]$Open)
            timevetor = c(timevetor, index(minutedata)[k])
            cashvetor = c(cashvetor, realcash)
            stockassetvetor = c(stockassetvetor, stockasset)
            allvetor = c(allvetor, realcash+stockasset)
            print(paste('i: ', i, '  k: ', k, '  realcash: ', realcash, '  stockasset: ', stockasset, '  ',index(minutedata)[k] ))
        }
    }

    return(data.frame(time=as.POSIXct(timevetor, origin='1970-01-01', tz='UTC'), realcash=cashvetor, stockasset=stockassetvetor, all=allvetor))
}

(Eww, complex enough...)

Both functions above accept minutedata and daydata (the zoo objects generated earlier), then pretend there is a smart investor who can watch the market every minute. Once the price reaches a trigger value, the investor knows it's time to buy or sell and adjusts his/her assets accordingly. Sometimes it's time to sell but the investor holds no shares, so he/she does nothing. Similarly, he/she does nothing on a "buy!" signal without enough cash. Finally, the functions return data.frame objects recording the investor's assets at every minute.
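Stripped of the bookkeeping, the per-minute decision in the long-only function boils down to a small state update. The following Python sketch restates that update for a single minute. It is a simplified illustration with made-up names, not a port of the R code, and like the simulation it ignores transaction costs.

```python
def step_long_only(cash, shares, price, day_open, trigger1, trigger2):
    """Apply one minute of the long-only rule and return the new (cash, shares)."""
    if price > day_open + trigger1:
        # Buy signal: spend as much cash as possible on whole shares.
        bought = int(cash // price)
        return cash - bought * price, shares + bought
    if price < day_open - trigger2:
        # Sell signal: liquidate everything (a no-op when nothing is held).
        return cash + shares * price, 0
    return cash, shares


def total_assets(cash, shares, price):
    """Mark the portfolio to the current price."""
    return cash + shares * price
```

A buy signal with too little cash buys zero shares, and a sell signal with no holdings changes nothing, which matches the "does nothing" cases described above.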

Next step. You may have to wait a night for these lines of code:

gen_trade_simp_result = starttradesimp(minutedata, daydata)
# verrrrrrrryyyyyyyyy slooooooooooooooowwwwwwwww!
gen_trade_result = starttrade(minutedata, daydata)
# verrrrrrrryyyyyyyyy slooooooooooooooowwwwwwwww!

result

head(gen_trade_simp_result)
##                  time  cash stockasset   all
## 1 2010-01-05 09:31:00 1e+06          0 1e+06
## 2 2010-01-05 09:32:00 1e+06          0 1e+06
## 3 2010-01-05 09:33:00 1e+06          0 1e+06
## 4 2010-01-05 09:34:00 1e+06          0 1e+06
## 5 2010-01-05 09:35:00 1e+06          0 1e+06
## 6 2010-01-05 09:36:00 1e+06          0 1e+06
head(gen_trade_result)
##                  time realcash stockasset   all
## 1 2010-01-05 09:31:00    1e+06          0 1e+06
## 2 2010-01-05 09:32:00    1e+06          0 1e+06
## 3 2010-01-05 09:33:00    1e+06          0 1e+06
## 4 2010-01-05 09:34:00    1e+06          0 1e+06
## 5 2010-01-05 09:35:00    1e+06          0 1e+06
## 6 2010-01-05 09:36:00    1e+06          0 1e+06

Well, you probably know the structure of the result data.frames now.

Why not have a plot?

qplot(x = gen_trade_simp_result$time, y = gen_trade_simp_result$all)

[Figure: total assets over time, without short selling (gen_trade_simp_result)]

qplot(x = gen_trade_result$time, y = gen_trade_result$all)

[Figure: total assets over time, with short selling (gen_trade_result)]

= = /// The results are very promising!!!!!!!!!!! Please analyze the pictures by yourselves, and check whether there is any error in my functions.

end

The simulation results suggest that Dual Thrust is a magical and efficient strategy in the stock market. So... really? Of course not. First, I do not consider market impact or transaction costs. Second, I ignore the possibility of a margin call. Third, you have to apply the strategy for a relatively long time. And with so many factors influencing the market, no strategy guarantees profits.

One more thing: Huatai Securities has released two detailed and professional reports (first one & second one) about Dual Thrust in Chinese.

Attention again: I am NOT responsible for ANY of your loss!

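A Minimal Way to Build an R Package

Preface

Recently I wanted to try putting together an R package, so I consulted some tutorials. The best ones I have found so far are "开发R程序包之忍者篇" (roughly, "Developing R Packages: The Ninja Way"), written a while ago by Yihui Xie, and the tutorial by Hadley (author of ggplot2, devtools, and a series of other packages). However, the former is somewhat outdated and the latter is entirely in English, so here I record a relatively simple process as a reference for readers. If you have some R code and want to put it into an R package of your own, this post may be what you are looking for. For ease of explanation, I use my own package as the example.

Preparation

  1. Install R.
  2. RStudio may be useful, but it does not matter if you do not have it.
  3. On Windows, install Rtools by downloading the exe installer from the official website. On Linux, install the corresponding R development package; on Debian/Ubuntu that means running sudo apt-get install r-base-dev. On OS X, install the command line tools; if you have not installed them, running git or xcode-select in Terminal should bring up an installation prompt that you can follow.
  4. Open R and run install.packages('devtools',dependencies=T)
  5. Have a program for editing code, such as Sublime Text or Notepad++. Please do not edit code with Notepad! Also remember to save files in UTF-8 (without BOM) encoding!
  6. You need a bunch of R function(s) that already run correctly and that you want to put into your R package.

The items above (except item 6) are the prerequisites (the toolchain) for writing an R package. Next we start writing the package. Note that this is neither the so-called official way nor the most perfect way, and the result is not guaranteed to be acceptable on CRAN. But what it produces should be installable and runnable by other people (what a low bar =_=///).

Writing

Skeleton

library('devtools') # the black-magic toolkit for developing R packages
create('~/somebm') # create the package directory; somebm is the package name you want
setwd('~/somebm') # set the working directory to the package directory; always recommended while developing a package
dir() # list the files and folders in the current working directory

The steps above create the most basic directory skeleton for an R package and make the skeleton folder the current working directory. Look at what the generated folder contains: a folder named R, an empty folder named man, and a file named DESCRIPTION.

Adding DESCRIPTION

In fact, for our simplest (but working) R package, we only need to work with the files in the R folder and the DESCRIPTION file!

To keep it simple, first look at the contents of the DESCRIPTION file (with a code editor or file.edit('DESCRIPTION')):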

Package: somebm
Title: 
Description: 
Version: 0.1
Authors@R: # getOptions('devtools.desc.author')
Depends: R (>= 3.0.2)
License: # getOptions('devtools.desc.license')
LazyData: true

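Fill in the fields one by one. For example, my package is about Brownian motion and I want to release my code under the MIT license, so I change the file to this: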

Package: somebm
Title: some Brownian motions simulation functions
Description: some Brownian motions simulation functions
Version: 0.1
Author: Laowu Wang <[email protected]>
Depends:
    R (>= 3.0.2)
License: MIT
LazyData: true

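Save it and you are done! Barring surprises, this file needs no further changes!

Adding *.R files

Next, our focus is the files in the R folder inside the package folder.

This folder should hold all of your own R code. How you arrange it and which files you put it in hardly matters, as long as (you feel) it is tidy and not messy.

Note that a file named somebm-package.r (<packagename>-package.r) has already been created in this directory. This file should be kept as the description file for the package, and it is best not to put your own functions in it.

Talk is cheap. Here is an example.

In this directory, create a file named bm.R. Since my package simulates Brownian motion, I put the simulation function I have already written into it. Write the following in bm.R: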

fbm <- function(hurst=0.7, n=100){
  delta <- 1/n
  r <- numeric(n+1)
  r[1] <- 1
  for(k in 1:n)
    r[k+1] <- 0.5 * ((k+1)^(2*hurst) - 2*k^(2*hurst) + (k-1)^(2*hurst))
  r <- c(r, r[seq(length(r)-1, 2)])
  lambda <- Re((fft(r)) / (2*n))
  W <- fft(sqrt(lambda) * (rnorm(2*n) + rnorm(2*n)*1i))
  W <- n^(-hurst) * cumsum(Re(W[1:(n+1)]))
  X <- ts(W, start=0, deltat=delta)
  return(X)
}

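Save. Then run the following in R or RStudio:

#setwd('~/somebm') # not needed if the earlier R session is still open
load_all() # load all .R files from the R folder of the package skeleton
fbm() # test the function you wrote
fbm(hurst=0.2, n=1000) # test it again

The load_all() function magically loads all the .R files from the R folder of the package skeleton. Every time you improve your *.R files, just run load_all() once to pull in the latest versions of your functions, and then you can test in R whether the latest code works.

Wait... you may have forgotten something...

Documentation and comments

Code without comments is the root of all evil
-- Albus Dumbledore

Other things can be skipped, but documentation and comments absolutely cannot.

In fact, an R package requires every exported function, variable, and data structure to have corresponding documentation. The man folder holds the corresponding *.Rd files, which are written in something odd-looking (LaTeX?). We can write function comments in a more concise way and then generate the corresponding *.Rd files from them.

Specifically, first modify the bm.R file: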

#' Generate a time series of fractional Brownian motion.
#'
#' This function generates a time series of one-dimensional fractional Brownian motion,
#' adapted from http://www.mathworks.com.au/matlabcentral/fileexchange/38935-fractional-brownian-motion-generator .
#'
#' @param hurst the Hurst index, with the default value 0.7
#' @param n the number of points between 0 and 1 that will be generated, with the default value 100
#' @export
#' @examples
#' fbm()
#' plot(fbm())
#' d <- fbm(hurst=0.2, n=1000)
#' plot(d)
fbm <- function(hurst=0.7, n=100){
  delta <- 1/n
  r <- numeric(n+1)
  r[1] <- 1
  for(k in 1:n)
    r[k+1] <- 0.5 * ((k+1)^(2*hurst) - 2*k^(2*hurst) + (k-1)^(2*hurst))
  r <- c(r, r[seq(length(r)-1, 2)])
  lambda <- Re((fft(r)) / (2*n))
  W <- fft(sqrt(lambda) * (rnorm(2*n) + rnorm(2*n)*1i))
  W <- n^(-hurst) * cumsum(Re(W[1:(n+1)]))
  X <- ts(W, start=0, deltat=delta)
  return(X)
}

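The run of comments above the function is the documentation. Note that these comments start with #' and are processed by helper functions in devtools. First comes a short description of the function, then a detailed description, and then the lines starting with #' @param describe each parameter. Next, every function meant for users needs a #' @export line. Finally, the lines following #' @examples are example usages.

Just run

document()

and the corresponding *.Rd files will be generated in the man folder.

Packaging

A single command

build()

generates a bundle named something like somebm_0.1.tar.gz in the directory alongside the package folder. You can install it in R with install.packages('~/somebm_0.1.tar.gz', type='source')!

Congratulations! You have basically finished a package!

Suddenly it's over... or is it?

After the steps above, you can say a package has been built. What a sense of achievement!

Of course, the official documentation, with its detailed (long and tedious) write-ups, covers this more thoroughly and describes many more features: demos, datasets, tests, the S3, S4, and R5 (reference class) object systems, and so on. What we packaged above can already go on GitHub and R-Forge, but putting it on CRAN probably requires higher quality and a rather tedious process. Interested readers can dig deeper!

Last of all, all the example code in this post is here; the man folder and the NAMESPACE file are both generated automatically.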

use CrunchBang

experience

My Hackintosh crashed a few days ago. What's worse, I couldn't install OS X 10.9 Mavericks any more because of some weird stuff! :-( So in case I want to do something related to *nix, I want to install a desktop Linux now. To be safe, I am installing it in a virtual machine first.

A long time ago, I was accustomed to using Ubuntu. In fact I started using it from 8.10, and coincidentally a new version, 13.10, was released a few days ago. However, I don't admire the new design (Unity and many other things), and the beautiful Linux Mint hasn't released a new version based on Ubuntu 13.10 yet, so I searched on Google and discovered the very simple CrunchBang, a distribution based on Debian.

CrunchBang is a Debian GNU/Linux based distribution offering a great blend of speed, style and substance. Using the nimble Openbox window manager, it is highly customisable and provides a modern, full-featured GNU/Linux system without sacrificing performance.

So I installed it in VMware. It is of course similar to Ubuntu, and it is very lightweight and, well, ugly or simple, without any entertainment. :-) At first I wanted to use the "testing" or "unstable" branch to update my applications, but I always ran into trouble. So in the end I had to use the "stable" branch, feeling the stable (and old) packages of Debian.

I saved some of the steps in one of my gists. If anyone is interested or wants some help, he or she can leave comments.

additional tips

tip #1 get rid of the gnome3 after dist-upgrade

Look at lines 1 to 8:

tip #2 how to install vmware tools in Debian 7 wheezy / CrunchBang 11 waldorf

I haven't tried to install CrunchBang in VirtualBox, so this tip only applies to VMware.

Attention: in my case, installing the default, official VMware Tools caused a lot of trouble. So I strongly recommend open-vm-tools, an open source alternative to the official VMware Tools.

And remember, always reboot the machine afterwards to avoid any trouble.

notes on using Skype

Skype is a classic Internet telephony service.

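These past few days, for various reasons, I needed to find a way to call the United States from mainland China. I was scared off by China Mobile's expensive rate of 0.80 yuan per 6 seconds, so I went for Skype without hesitation. Since there is not much Chinese-language material on this, I am sharing the concrete steps here.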

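Skype's mainland China business is operated by Tom. So visiting http://www.skype.com/en/ from mainland China will most likely redirect you to http://skype.tom.com/ . The mainland version of the client is said to have some strange problems, so I strongly recommend that everyone download the international version instead. Here is the address of the international client as of 20131010; just use "save link as".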

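The mobile version has to be downloaded from the corresponding app store. Also, on iTunes you have to download from the US store or another non-mainland store to get the international version.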

As for the account, you can create a new one, or follow the guide and sign up with a Windows Live account.

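Next comes topping up. Since most people in mainland China are not in the habit of using credit cards, this is where having Tom as the agent pays off: it offers online banking, Alipay, Tenpay, and other payment methods that mainland users like.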

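Visit http://skype.tom.com and look for the purchase page. If you don't know which plan or phone card to buy, I strongly recommend the "Euro card" (欧元卡), because it is the product that is shared with the international version. The rest are mainland-specific top-up cards, and I don't know whether they can be used with the international version of Skype. Of course, if you know what you are doing, feel free to choose. :-) During the purchase you have to enter your Skype account name; note that this is not the email address you registered with. Shortly after paying, you can see your new balance in the client or on the web page.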

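Then just use the client to make calls. Note that you have to select the country code. Also, the basic setup above only lets you call out to the other party's phone. The other party cannot call you back from the caller ID; if you want callbacks, study the instructions on the international website yourself, or simply buy a cheap, real international calling card.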

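Afterwards, you can view the cost of each call in the client, or log in to https://secure.skype.com/account/usage?display=call to check it. According to the official website at the moment, the rate is about 0.19 yuan per minute; see the official site for details. In addition, each connected call is charged about 0.5 yuan extra. Of course, certain special numbers are exceptions and may even be free; those numbers are similar to 10086 in China.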
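To put the two rates side by side, here is a rough sketch in Python. It only uses the figures quoted in this post (China Mobile at 0.80 yuan per 6 seconds, Skype at about 0.19 yuan per minute plus about 0.5 yuan per connected call). The function names and the rounding rules (China Mobile charged per started 6-second unit, Skype per started minute) are my own assumptions for illustration, not official billing rules.

```python
# Rough cost comparison using the rates quoted in this post.
# All amounts are in fen (1 yuan = 100 fen) to avoid floating point rounding.
# ASSUMPTION: each carrier rounds the duration up to its own billing unit.

MOBILE_FEN_PER_UNIT = 80     # 0.80 yuan per unit
MOBILE_UNIT_SECONDS = 6      # one unit is 6 seconds
SKYPE_FEN_PER_MINUTE = 19    # about 0.19 yuan per minute
SKYPE_CONNECTION_FEN = 50    # about 0.5 yuan per connected call


def ceil_div(a, b):
    """Integer division rounded up, for non-negative a and positive b."""
    return -(-a // b)


def mobile_cost_fen(seconds):
    """Hypothetical China Mobile cost for a call of the given length."""
    return ceil_div(seconds, MOBILE_UNIT_SECONDS) * MOBILE_FEN_PER_UNIT


def skype_cost_fen(seconds):
    """Hypothetical Skype cost: connection fee plus per-minute charge."""
    return SKYPE_CONNECTION_FEN + ceil_div(seconds, 60) * SKYPE_FEN_PER_MINUTE


if __name__ == "__main__":
    call = 10 * 60  # a 10-minute call, in seconds
    print("China Mobile: %.2f yuan" % (mobile_cost_fen(call) / 100))
    print("Skype:        %.2f yuan" % (skype_cost_fen(call) / 100))
```

Under these assumptions a 10-minute call comes to 80.00 yuan with China Mobile and 2.40 yuan with Skype, so the connection fee barely matters except for very short calls.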

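My impression is that the cost is very low compared with China Mobile, while the voice quality leaves room for improvement. Of course, Skype's voice quality is probably the best (or one of the best?) among similar products, and the problem may have been my network. When I called from my iPhone over Wi-Fi, the call sometimes dropped and I had to dial again. Later I switched to calling from my computer; the connection was stable, but the other party's voice was a bit quiet, and I don't know whether that was the network or the speakers.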

All in all, anyone who needs this can follow this post, for example people who want to keep in touch with relatives abroad.

preferring Mac OS X

At last, I installed Mac OS X Mountain Lion 10.8.3 on my ThinkPad T410, though I used to think I would never use a hackintosh. I am getting used to OS X and am considering moving my whole workflow to OS X now.

In my old post, I preferred to avoid hackintosh because of the poor hardware driver support and the difficulty of building a perfect system environment. But the world is developing much faster than I expected: I found some awesome posts about running a perfect OS X on my laptop on [bbs.pcbeta.com], the most advanced hackintosh discussion platform in my country. So I spent a weekend getting a smooth Mac system running on my PC laptop, with the correct *.kext, DSDT, etc., although the wireless adapter and the SD card reader don't work and won't work. Nothing has crashed so far. Yes, it's perfect and stable enough to work on.

So why does OS X attract me?

Firstly, it's beautiful and elegant. Windows 7 is beautiful too, but I have been looking at it for a long time, and it's time for me to appreciate another kind of art. :-) Ubuntu and other Linux distributions always have ugly desktops. I don't run into hardware issues in Linux, but I am always disappointed by its weird arrangement of windows, strange launch bar and other such things.

Secondly, it's consistent. I mean the style of the applications and the consistent keyboard shortcuts. Most applications keep a similar GUI style. (Maybe because all of them are developed with the Cocoa framework?) What also amazes me is that I can guess a keyboard shortcut in a new application, because similar tasks in different applications share the same shortcuts, which contributes a lot to my productivity and happiness.

Lastly, it ensures productivity. The consistent keyboard shortcuts, the fact that most applications can enter full screen mode, the high-quality applications, and the Unix-like system... The combination of these advantages keeps work smooth. I can give up the mouse most of the time, with the help of system keyboard shortcuts and Quicksilver. And I can concentrate on one thing in full screen mode. Some professional (and beautiful) applications such as Photoshop exist only on Mac or Windows. Some development tools such as command-line tools run smoothly only on Mac or Linux. (I don't use Windows-only programs, such as those that rely on .NET, at present.) So from my point of view the software environment of OS X combines the advantages of Linux and Windows, though I have to switch to Windows when I want to use online banking. What's more, it surprises me that GUI applications can pass messages to each other, which allows me to use some services to improve productivity even more!

I hope OS X becomes even better in the future. I think I will buy an Apple product the next time I want a new computer (after I have earned enough money, though).

no one seen in the empty mountain

The winter vacation has more or less begun, and I am starting a fairly "long" stretch of time living at home.

But something feels off.

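I woke up in my own room at home and kept feeling that something was off. The bedding, the wardrobe, and the dolls on the desk were all the same as ever, silently waiting for my return, or perhaps not caring at all whether I came back, still sitting quietly where they sat. Then it occurred to me that the pillow, the books, the desk lamp, and everything else I left in my dorm must be just like this right now, waiting in the gathering dust for me to come back and see them again. Both of these small spaces give me the same feeling.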

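But they should not feel the same. It dawned on me that the things in the house were giving me a sense of strangeness. But this is my home! Anywhere else may be unfamiliar, but this should be the most familiar place of all, a place where I can move from room to room with my eyes closed!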

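In the afternoon I walked out of the house and wandered through the small town where I grew up. Over the years in which I grew up, the town grew too. Somewhere the hawkers' cries have vanished, somewhere a shop has changed its face, somewhere a place has been renovated. Either way, the town has changed. Old trees have put out new buds, friends have moved to new homes, the middle school has changed its name, and the price of beef offal keeps going up... My memories, the old shops, and that familiar town seem to have been bundled up and thrown into the upper reaches of time, while I, a mere mortal, keep being swept along by the flood, watching helplessly and numbly as the memories blur and disappear at the far end of time.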

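Since childhood I have found that, basically, unfamiliar beds do not bother me. That is, many people toss and turn in any bed that is not their own, muttering that their own bed is still the most comfortable; whereas in the dorm, at a friend's home, or on a trip, I never lose sleep for that reason. I know these places are unfamiliar, but I also know that I am only passing through and will one day return to that familiar place. Even though it is I who am out there travelling, I am just a bystander, calmly watching "my" journey.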

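But now, in my own home, I feel the same thing I feel in the dorm. It is not that the dorm is becoming more familiar; it is that home feels more and more strange and distant.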

So where has my familiar place gone?

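Since childhood I have longed for the romantic kind of travel that roams everywhere, so leisurely and carefree. But if one truly drifts through a whole lifetime, only hardship and weariness remain.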

I feel lost. Although I am still under my parents' shelter, somehow I have already begun a life of drifting.

A lifetime of drifting, only to chase the time that has passed and will not return.