Difference between revisions of "Main Page/Research/MSB/Data processing/how it works"

From phurvitz

Revision as of 22:22, 10 October 2007

Each time the MSB starts, a new session is started.

Downloaded data are stored in separate directories, named part_00, part_01, etc.

Each part should be downloaded and processed separately. Use the script msb.get.data to download all the parts.
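
To process each part separately, the part directories first have to be enumerated in order. The following Python sketch is not part of msb.get.data or any MSB tool; the function name list_parts is hypothetical, and the sketch assumes only that the parts sit side by side in one local directory and follow the part_NN naming described above.

```python
import re
from pathlib import Path

# Matches directory names such as part_00, part_01, ...
PART_DIR = re.compile(r"^part_\d+$")


def list_parts(root):
    """Return the part_NN subdirectories of root, sorted by part number.

    Hypothetical helper: files and directories with other names are ignored.
    """
    root = Path(root)
    parts = [p for p in root.iterdir() if p.is_dir() and PART_DIR.match(p.name)]
    return sorted(parts, key=lambda p: int(p.name.split("_")[1]))
```

Each returned path can then be handed to the processing step on its own, so that one session's data is never mixed with another's.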

Once downloaded, use the R script read.msb.files to generate csv files.

To remove records with duplicate second-level timestamps from the class.csv file, use the Perl script msb_remdupes.pl.
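
The idea behind the duplicate removal can be illustrated in Python. This is a minimal sketch, not the logic of msb_remdupes.pl itself: it assumes the timestamp column is named date, that its values are already strings at one-second resolution, and that the first record for each second is the one to keep. The function names are hypothetical.

```python
import csv


def remove_duplicate_seconds(rows, key="date"):
    """Yield rows in order, skipping any whose timestamp was already seen.

    Assumption: row[key] is a string timestamp with one-second resolution.
    """
    seen = set()
    for row in rows:
        stamp = row[key]
        if stamp in seen:
            continue  # a record for this second was already kept
        seen.add(stamp)
        yield row


def dedupe_csv(src, dst, key="date"):
    """Copy CSV text from file object src to dst, keeping the first row per timestamp."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(remove_duplicate_seconds(reader, key))
```

Keeping the first record is only one possible policy; keeping the last, or averaging values within a second, would need a different loop.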

The csv files can be related by timestamp (e.g., phone_log and class by the date field).
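
As an illustration of relating two of the files, the Python sketch below joins phone_log rows to class rows on the date field. It assumes both files have been read into lists of dictionaries (for example with csv.DictReader), that date values are formatted identically in both files, and that class has at most one row per timestamp, as it would after duplicate removal. The function name and the column prefixes are hypothetical.

```python
def join_on_date(phone_log_rows, class_rows, key="date"):
    """Inner-join two lists of dict rows on key.

    Non-key columns are prefixed with their source name to avoid clashes.
    Assumption: class_rows holds at most one row per timestamp.
    """
    by_date = {row[key]: row for row in class_rows}
    joined = []
    for row in phone_log_rows:
        match = by_date.get(row[key])
        if match is None:
            continue  # no class record for this timestamp
        merged = {key: row[key]}
        merged.update({f"phone_log.{k}": v for k, v in row.items() if k != key})
        merged.update({f"class.{k}": v for k, v in match.items() if k != key})
        joined.append(merged)
    return joined
```

Only timestamps present in both files appear in the result; phone_log rows without a matching class record are dropped.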