Difference between revisions of "Main Page/Research/MSB/Data processing/how it works"
From phurvitz
Phil Hurvitz
Revision as of 22:22, 10 October 2007
Each time the MSB starts, a new session is started.
Downloaded data are stored in separate directories, named part_00, part_01, etc.
Each part should be downloaded and processed separately. Use the script msb.get.data to download all the parts.
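Because each part is handled on its own, it can help to enumerate the part directories before processing. The following is a minimal Python sketch, not part of msb.get.data; the function name list_parts and the assumption that all parts sit under one root directory are hypothetical.

```python
import re
from pathlib import Path

# Matches directory names such as part_00, part_01, ...
PART_DIR = re.compile(r"^part_\d+$")

def list_parts(root):
    """Return the part_NN directories under root, sorted by part number.

    Hypothetical helper: assumes every session directory sits directly
    under `root` and follows the part_NN naming pattern.
    """
    dirs = [p for p in Path(root).iterdir()
            if p.is_dir() and PART_DIR.match(p.name)]
    return sorted(dirs, key=lambda p: int(p.name.split("_")[1]))
```

Each returned directory could then be passed, one at a time, to whatever processing step follows.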
Once downloaded, use the R script read.msb.files to generate csv files.
To remove records from the class.csv file whose timestamps duplicate one another to the second, use the Perl script msb_remdupes.pl.
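The idea behind the duplicate removal can be sketched in Python. This is not the msb_remdupes.pl script itself, whose exact behavior is not shown here. The sketch assumes that the timestamp column is named date, that timestamps are already stored at one-second resolution, and that the first record for each second is the one to keep.

```python
import csv

def remove_duplicate_seconds(rows, key="date"):
    """Yield rows in order, skipping any whose timestamp was already seen.

    Assumption: `key` names the timestamp column and values are
    one-second-resolution strings; the first record per second is kept.
    """
    seen = set()
    for row in rows:
        stamp = row[key]
        if stamp in seen:
            continue
        seen.add(stamp)
        yield row

def dedupe_csv(src, dst, key="date"):
    """Copy the CSV file src to dst, dropping duplicate-timestamp records."""
    with open(src, newline="") as fin, open(dst, "w", newline="") as fout:
        reader = csv.DictReader(fin)
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
        writer.writeheader()
        writer.writerows(remove_duplicate_seconds(reader, key))
```

For example, dedupe_csv("class.csv", "class_nodupes.csv") would write a cleaned copy and leave the original file untouched; the output file name is only an illustration.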
The csv files can be related by timestamps (e.g., phone_log and class by the date field).
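Relating two of the files by timestamp amounts to a join on the date field. The following is a minimal Python sketch under the assumption that both files have a column named date with identically formatted values; the helper names read_csv, index_by, and relate are hypothetical.

```python
import csv
from collections import defaultdict

def read_csv(path):
    """Load a CSV file as a list of dicts keyed by column name."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def index_by(rows, key="date"):
    """Group rows by the value of their `key` column."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index

def relate(phone_rows, class_rows, key="date"):
    """Yield (phone_log row, class row) pairs that share a timestamp.

    This is an inner join: rows without a match in the other file
    are dropped.
    """
    class_index = index_by(class_rows, key)
    for phone_row in phone_rows:
        for class_row in class_index.get(phone_row[key], []):
            yield phone_row, class_row
```

If class.csv has been cleaned of duplicate timestamps first, each phone_log record matches at most one class record.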