Difference between revisions of "Program data quality"

From TV-Browser Wiki
(Just a lot of small mechanical and idiomatic fixes.)
m (Program Data Quality)

Revision as of 15:15, 31 January 2008

Program Data Quality

The program data is received from the press office of each broadcast station and is processed automatically, so its quality depends on the quality of the data we obtain.

Some broadcast stations provide the data as structured XML. From others, we receive the information in an unstructured, proprietary format such as RTF.

As a result, the data from some broadcast stations lists the actors individually, for example, while the data from others does not.
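The difference can be illustrated with a minimal Python sketch. The element names (`program`, `actor`) and the sample data are invented for this illustration; they are not the format used by any particular station or by TV-Browser itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured delivery: each actor is its own element.
STRUCTURED = """
<program>
  <title>Example Movie</title>
  <actor>Jane Doe</actor>
  <actor>John Roe</actor>
</program>
"""

# Hypothetical unstructured delivery: the cast is buried in free text.
UNSTRUCTURED = "Example Movie. A thriller starring Jane Doe alongside John Roe."


def actors_from_xml(xml_text):
    """Return the actor names that are listed as separate <actor> elements."""
    root = ET.fromstring(xml_text)
    return [element.text.strip() for element in root.findall("actor")]


def actors_from_text(text):
    """Free text carries no reliable marker for names, so nothing is extracted."""
    return []


structured_actors = actors_from_xml(STRUCTURED)
unstructured_actors = actors_from_text(UNSTRUCTURED)
```

With the structured input the actors come out as a list of individual names. With the free-text input the names stay embedded in the description, because there is no dependable way to tell where a name starts or ends.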

Some stations include information about the audio format (mono, stereo, dual or Dolby-Surround), while others don't.

Some provide even more information, such as cast blurbs. Because this information is unstructured, it can't be parsed out automatically, so it appears within the general broadcast description.

Sometimes the classification of a program (movie, quiz-show, etc.) is unavailable. When it is available, it is, once again, often presented in a proprietary format. Beyond that, the terminology itself is not standardized (e.g., one station speaks of a "Documentary," while another will call it an "Info-Show").
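To show why the missing standard matters, here is a small Python sketch of what mapping station-specific labels onto common categories would involve. The alias table and the function name `normalize_genre` are hypothetical and only illustrate the problem; this is not a description of how TV-Browser processes the data.

```python
# Hypothetical alias table: station-specific label -> common category.
# Every station's vocabulary would need its own hand-maintained entries.
GENRE_ALIASES = {
    "documentary": "documentary",
    "info-show": "documentary",
    "quiz-show": "quiz-show",
    "movie": "movie",
}


def normalize_genre(label):
    """Return the common category, or None if the label is missing or unknown."""
    if not label:
        return None
    return GENRE_ALIASES.get(label.strip().lower())
```

A label that is absent or not in the table yields `None`, which corresponds to the case where no classification is available for a program.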

Since TV-Browser is free of charge, we're unable to reformat all that data.

If you'd like to send us data, or if you'd like to know how the data is processed, you can find additional information in our tutorial.