Data management is a critical component of any scientific experiment, and a clear account of how the data is managed and manipulated is essential for transparency. This text therefore describes how data from the experiment is obtained, stored and manipulated to produce the charts and graphs presented in the associated paper and on the website that describes the experiment.
Data Acquisition
When the experiment is set up and the experimenter is ready to obtain data, the experimenter uses a data logger to pull the raw data from the experiment. The data logger is “add-on” software for the “Virtins Technology Multi Instrument.” The specific model used for this experiment was the VT DSO-2A20E USB. This “multi instrument” provides various measurement instruments from a single device, and the user purchases each instrument separately when buying the physical device. The computing is done by software installed on the user’s computer, which reduces the size and cost of the device. The experimenter purchased a package that included an oscilloscope, a spectrum analyzer and a data logger. Measurements made by the oscilloscope were recorded by the data logger during the experiment.
The experiment consists of nine measurements, each lasting approximately ten seconds. Measurements 1-3 are obtained with the instrument pointed toward the star Regulus. Measurements 4-6 are obtained with the instrument pointed 90 degrees away from the star Regulus. Measurements 7-9 are obtained with the instrument returned to the original position where it is pointed toward the star Regulus.
The data logger is set to record either the peak (max) values or the Vrms values of the sine wave that is generated along the Lecher line. Sometimes the experimenter logs both at the same time during the experiment. Data is obtained with the oscilloscope set to a single channel, A, with a sampling rate of 50 MHz and a bit resolution of 16 bits.
Once the data logger has been set up to record the specific type of data (Max, Vrms or both), the OK button is pushed and the data logger starts logging. The data logger does not automatically stop after a specific period of time, so the experimenter must time the measurement and stop the logging manually. Because the logging cannot be stopped at an exact time, the number of data points may differ from one measurement to another. In general, with the oscilloscope sampling at 50 MHz, a single 10-second measurement yields close to 200 logged data points, or about 20 per second.
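For anyone labeling the nine data sets during a replication, the pointing sequence above can be written down as a small lookup. This is only an illustrative sketch; the function name `orientation` and the label strings are my own choices and are not part of the logger software.

```python
def orientation(measurement: int) -> str:
    """Return the pointing orientation for measurement numbers 1-9."""
    if not 1 <= measurement <= 9:
        raise ValueError("measurement number must be between 1 and 9")
    # Measurements 4-6 are taken 90 degrees away from Regulus;
    # measurements 1-3 and 7-9 are taken pointing toward Regulus.
    return "90 degrees away" if 4 <= measurement <= 6 else "toward Regulus"


# Hypothetical helper table: measurement number -> orientation label.
schedule = {m: orientation(m) for m in range(1, 10)}
```

The `schedule` table can then be used to name the columns of the combined data file.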
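The effect of manual stopping on the number of data points can be estimated with simple arithmetic. The sketch below assumes a logging rate of about 20 data points per second, which is only inferred from the figures above (roughly 200 points in 10 seconds); the function names are hypothetical.

```python
def logged_rate(points: int, duration_s: float) -> float:
    """Average number of logged data points per second."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return points / duration_s


def expected_points(duration_s: float, rate_per_s: float = 20.0) -> int:
    """Approximate number of data points for a measurement of a given length.

    The default rate of 20 points per second is an assumption derived from
    "close to 200 data points" in "a period of 10 seconds".
    """
    return round(duration_s * rate_per_s)
```

Under this assumption, stopping half a second late (`expected_points(10.5)`) would add about ten data points to that measurement, which is why the columns end up with slightly different lengths.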
Data Recording
The data logger saves the data on the computer as a .log file. The first column of this file lists the sequential number of each data point. The next column lists the exact time the data point was obtained. The following column shows the voltage value for that data point. One voltage column is saved for each type of data requested from the data logger, so when both Peak (max) voltage and Vrms are logged, each set of values has its own column of voltages measured by the oscilloscope during the experiment. At the end of an experiment, a total of nine .log files have been created by the data logger and stored on the computer.
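For readers who prefer to read the files with a script, the column layout described above could be parsed roughly as follows. This is a sketch under stated assumptions: the exact delimiter, header text and time format of the .log file are not specified here, so the code assumes comma-separated text, and the sample lines are made-up values used only to show the shape of the data.

```python
import csv
import io


def read_log(text: str, delimiter: str = ","):
    """Parse logger text into (point number, time, [voltages]) tuples.

    Assumes one data point per line: number, time, then one voltage
    column per logged quantity (for example Max and Vrms).
    """
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter=delimiter):
        fields = [f.strip() for f in fields]
        # Skip headers, blank lines and anything that is not a data row.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        rows.append((int(fields[0]), fields[1], [float(v) for v in fields[2:]]))
    return rows


# Made-up sample in the assumed layout; not real experimental data.
sample = (
    "No,Time,Max(V),Vrms(V)\n"
    "1,21:04:10.050,0.512,0.361\n"
    "2,21:04:10.100,0.498,0.352\n"
)
rows = read_log(sample)
```

Each tuple in `rows` holds the data point number, the time as text, and a list with one voltage per logged quantity.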
Data Manipulation
Data manipulation starts with converting the data to a file that Excel can open. All .log files are manually converted to .csv simply by renaming the file extension on the computer.
Next, each .csv file is opened individually. The voltage data points are copied from the original file and pasted into a new .csv file that collects the voltage data from all nine measurements. The data point numbers and collection times are not copied into the new file. When all data has been copied and pasted, the new file has nine columns, each holding the voltage readings from the Lecher line for one measurement. If Peak (max) and Vrms are both collected, the new file has 18 columns. The name of the new file includes the date and time of the experiment.
At the bottom of the columns, it will be seen that the number of data points is not equal across data sets. This is because the experimenter has to stop the data logging manually, and the small differences in timing slightly change the number of data points. To correct for this, the experimenter finds the column with the fewest data points and deletes the last few data points from every longer data set so that each column has exactly the same number of data points. This deletion is done without consideration of the values being deleted. The resulting raw data is stored on this website, where it is publicly available for review. Once this is completed, there is no further manipulation of the data and it is ready for evaluation by the experimenter.
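The experimenter performs this step by hand, but the same renaming could be scripted. The following is a minimal sketch using Python's standard `pathlib` module; the function name `logs_to_csv` and the folder argument are hypothetical.

```python
from pathlib import Path


def logs_to_csv(folder):
    """Rename every .log file in `folder` to .csv and return the new paths.

    Only the file extension changes; the file contents are left untouched.
    """
    renamed = []
    for log_path in sorted(Path(folder).glob("*.log")):
        csv_path = log_path.with_suffix(".csv")
        if csv_path.exists():
            # Refuse to overwrite an existing file with the same name.
            raise FileExistsError(f"{csv_path} already exists")
        log_path.rename(csv_path)
        renamed.append(csv_path)
    return renamed
```

Working on a copy of the original .log files is advisable so that the untouched logger output is preserved.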
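The copy-and-paste step is done manually in Excel, but it amounts to placing the voltage lists side by side. A sketch of the same operation in Python follows; the function name `write_combined` is hypothetical, and the optional header row is my own addition rather than part of the procedure described above.

```python
import csv
from itertools import zip_longest


def write_combined(columns, handle, headers=None):
    """Write voltage lists side by side, one column per measurement.

    `columns` is a list of lists of voltages; `handle` is an open text
    file (opened with newline="") or any file-like object.
    """
    writer = csv.writer(handle)
    if headers:
        writer.writerow(headers)
    # Columns may have different lengths at this stage, so shorter
    # columns are padded with empty cells, as they appear in a spreadsheet.
    for row in zip_longest(*columns, fillvalue=""):
        writer.writerow(row)
```

A typical call would open the output with `open(name, "w", newline="")`, where `name` contains the date and time of the experiment, and pass nine lists (or 18 when both Peak and Vrms are logged).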
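The trimming rule can be stated compactly in code. The sketch below is an illustration of the rule rather than the tool the experimenter used (the trimming was done by hand in the spreadsheet); the function name `trim_to_shortest` is hypothetical.

```python
def trim_to_shortest(columns):
    """Cut every column to the length of the shortest one.

    Points are removed only from the end of each column and without
    looking at their values, so the rule cannot favor any outcome.
    """
    if not columns:
        return []
    shortest = min(len(column) for column in columns)
    return [list(column[:shortest]) for column in columns]
```

For example, columns with 203, 198 and 201 data points would each be cut to 198 points, and the shortest column would be left unchanged.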
Conclusion
For scientists who choose to replicate this experiment, it is strongly suggested that the data manipulation be performed exactly as described. The loss of a few data points is insignificant as long as the process is unbiased and is applied in exactly the same way for each experiment. If you have questions or concerns, please contact me via this website.
Rene Steinhauer