Background information for data providers

From TV-Browser Wiki

Latest revision as of 15:09, 7 June 2009

This article contains background information on the topic Providing TV listings with a Primary Data Service.

How the tools work

Get and prepare raw data

The PDSRunner tool starts the parsers, which fetch the program data from the channels and convert it into a TV-Browser-specific format. The resulting data files are stored in the "raw" directory.

Create diff files

The PrimaryDataManager tool looks for differences between the new data in the "raw" directory and the existing data in the "prepared" directory. If it finds differences, it creates update files.

To keep the traffic for data providers low, the data is split into several files:

  • "base" contains the time, the title and additional data such as actors
  • "more00-16" contains the descriptions for programs between 0.00 and 16.00
  • "more16-00" contains the descriptions for programs between 16.00 and 0.00
  • "picture00-16" contains the pictures for programs between 0.00 and 16.00
  • "picture16-00" contains the pictures for programs between 16.00 and 0.00
  • For days with more than 255 programs there are also "additional" files

The only file that is always required is "base".

Each program gets an ID that links its entry in the "base" file to the matching entries in the "more" and "picture" files. The IDs are also needed to keep the data consistent across the "update" files.
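The split by time of day and the role of the ID can be illustrated with a minimal Python sketch. It is not the tools' actual file format: the dictionaries, the sample programs and the function names are hypothetical, and it assumes that a program starting at exactly 16.00 belongs to the "16-00" file, which the text above does not specify.

```python
from datetime import time
from typing import Optional


def description_file(start: time) -> str:
    """Return the name of the "more" file that would hold a program's description."""
    # Assumption: a start time of exactly 16.00 falls into the second file.
    return "more00-16" if start < time(16, 0) else "more16-00"


# Hypothetical in-memory stand-ins for the files of one channel and one day,
# keyed by program ID.
base = {
    1: {"start": time(6, 30), "title": "Morning News"},
    2: {"start": time(20, 15), "title": "Evening Film"},
}
more = {
    "more00-16": {1: "Headlines of the day."},
    "more16-00": {2: "A feature film."},
}


def description_for(program_id: int) -> Optional[str]:
    """Look up a description via the program ID shared by "base" and "more"."""
    part = description_file(base[program_id]["start"])
    return more[part].get(program_id)
```

A client that only needs the evening descriptions would fetch "base" and "more16-00" and join the two on the program ID.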

Update files

On the first run, "full" files are created. Whenever the PrimaryDataManager finds changes in the data, it creates "update" files as needed. These update files contain the ID of each changed program and its new data. The changes are also written into the previous update files and into the first "full" version.
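As a rough illustration of the idea behind an update file, the following Python sketch compares two versions of a day program keyed by program ID and keeps only the entries that are new or changed. The function name and the dictionary layout are hypothetical; the sketch does not reproduce the real file format, and it covers neither removed programs nor the rewriting of earlier update files.

```python
def build_update(prepared: dict, raw: dict) -> dict:
    """Return {program_id: new_data} for programs that are new or changed in raw."""
    return {
        program_id: data
        for program_id, data in raw.items()
        if prepared.get(program_id) != data
    }


# Hypothetical example: program 2 changed its title, program 3 is new.
prepared_day = {1: {"title": "News"}, 2: {"title": "Film"}}
raw_day = {1: {"title": "News"}, 2: {"title": "Film (rerun)"}, 3: {"title": "Late Show"}}
update = build_update(prepared_day, raw_day)
```

Because only the changed entries are transferred, a client that already has the "full" file needs far less data than a fresh download.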

Upload the data

The MirrorUpdater tool uploads the files and some additional information to the mirrors.

Tips for data providers

The tools create the diff files from the data in the "prepared" directory, so you must never delete this directory. If you do, the "prepared" directory no longer matches the files on the mirrors or the files that users have already downloaded.

If the "prepared" directory has been deleted accidentally

If the "prepared" directory has been deleted for some reason, the following options may help to limit the damage.

All of the following hints are somewhat experimental. They might not work as expected and might even make things worse. Always back up all data before carrying out any of the steps, and make sure you understand what each step does.
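One possible way to take such a backup is sketched below in Python, using only the standard library. The function name and the timestamped naming scheme are assumptions made for illustration. The target should be a separate location of your own, not the "backup" directory used by the tools.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_directory(source: Path, backup_root: Path) -> Path:
    """Copy source into a new timestamped directory below backup_root."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{source.name}-{stamp}"
    # copytree creates missing parent directories and raises an error
    # if the target already exists, so an older backup is never overwritten.
    shutil.copytree(source, target)
    return target
```

For example, `backup_directory(Path("prepared"), Path("safety-copies"))` would copy the whole directory tree before any of the steps below touch it; both paths are placeholders.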

1. Make a backup of the "prepared" directory, then delete its contents.

2.a) If there has been at most one update since the deletion and no files have been uploaded to the mirrors since then, you can copy the files from the "backup" directory into the "prepared" directory. Make sure that the files in the "backup" directory really are the ones that existed before the "prepared" directory was deleted. (The "backup" files should contain more "update" files than the "prepared" files.)

2.b) If there has been more than one update (which also means that the "backup" directory contains the wrong files), or if you are not sure how many updates have been done since the deletion, or if files have been uploaded to the mirrors since then: copy all the files from the mirror into the "prepared" directory.

3. Start the PrimaryDataManager. If it succeeds, everything is hopefully fine again. If it fails with messages such as "Converting Day program (..) failed" and "Program frame with ID x has no start time" and you followed step 2.b), do the following:

For each channel where the PrimaryDataManager fails, start the PrimaryDataManager with the argument "-forceCompleteUpdate" followed by the channel's name. The PrimaryDataManager will then create new IDs for all programs (on all days) of this channel. This produces a lot of new "update" files, but it will hopefully fix the inconsistency in the data.

4. Run the MirrorUpdater.
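The choice between steps 2.a) and 2.b) and the per-channel rerun from step 3 can be summarised in a short Python sketch. It only encodes the rules stated above. The function names are hypothetical, and the `launcher` argument is a placeholder for however you normally start the PrimaryDataManager; only the "-forceCompleteUpdate" argument followed by the channel name comes from this article.

```python
from typing import List, Optional


def restore_source(updates_since_deletion: Optional[int],
                   uploaded_since_deletion: bool) -> str:
    """Return "backup" for step 2.a) or "mirror" for step 2.b).

    Pass None if the number of updates since the deletion is unknown.
    """
    if (updates_since_deletion is not None
            and updates_since_deletion <= 1
            and not uploaded_since_deletion):
        return "backup"
    return "mirror"


def force_update_commands(launcher: List[str],
                          failing_channels: List[str]) -> List[List[str]]:
    """Build one PrimaryDataManager call per failing channel.

    launcher is a placeholder for your usual start command; it is not
    taken from the tool's documentation.
    """
    return [launcher + ["-forceCompleteUpdate", channel]
            for channel in failing_channels]
```

Building the argument lists first and reviewing them before running anything fits the cautious approach recommended above, since every forced update creates new IDs for the whole channel.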