Friday, March 23, 2012

Keeping Things Fresh

Pop quiz: what’s the difference between the following feed URLs?

  1. https://gdata.youtube.com/feeds/api/users/googledevelopers/uploads?v=2
  2. https://gdata.youtube.com/feeds/api/users/googledevelopers/uploads?v=2&orderby=published
  3. https://gdata.youtube.com/feeds/api/videos?v=2&author=googledevelopers&orderby=published

All three return a list of videos uploaded to the GoogleDevelopers YouTube channel, with the most recent uploads listed first. However, only the first URL returns the freshest results available: the second and third feeds could each be missing videos that were uploaded within the past few hours. In addition, even when recent videos do appear in the second and third feeds, the metadata returned for them might not reflect recent updates.

The reason for this, as explained in our documentation, is that some requests go against our search index, which has cached data, while other requests retrieve data directly from our backend databases, which always contain the most up-to-date data. To determine whether a request will query the search index or the backend database, you can use the following rules of thumb:

  • If your request only includes the max-results and/or start-index query parameters, then it should go against the backend database and the results will be fresh. A few other parameters that change the way the feed is formatted, like prettyprint, callback, or alt, can also be used without triggering the search index. Although it does filter results out of the feed, the fields parameter can also be used while still going against the backend database, because the filtering is performed server-side after the data has been retrieved.

  • If your request contains other parameters, there’s a good chance it will go against the search index. Two common parameters that always trigger a search are q and orderby.
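
As a rough illustration, the rules of thumb above can be sketched as a small Python check. This is only a heuristic, not something the API provides: the function name and the parameter set are hypothetical, and treating v (the API version) as safe is an assumption based on the first feed URL in the quiz, which includes it and still returns fresh results.

```python
from urllib.parse import urlsplit, parse_qs

# Parameters the post describes as not triggering the search index.
# "v" is an assumption: the quiz's "fresh" URL carries it.
BACKEND_SAFE_PARAMS = {
    "max-results", "start-index",    # paging
    "prettyprint", "callback", "alt",  # formatting
    "fields",                        # filtering applied after retrieval
    "v",
}

def likely_hits_search_index(url: str) -> bool:
    """Return True if the URL has any query parameter outside the safe set.

    A True result only means "good chance of hitting the search index",
    mirroring the rule of thumb; it is not a guarantee either way.
    """
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return any(name not in BACKEND_SAFE_PARAMS for name in params)

if __name__ == "__main__":
    base = "https://gdata.youtube.com/feeds/api"
    for url in (
        base + "/users/googledevelopers/uploads?v=2",
        base + "/users/googledevelopers/uploads?v=2&orderby=published",
        base + "/videos?v=2&author=googledevelopers&orderby=published",
    ):
        print(likely_hits_search_index(url), url)
```

Run against the three quiz URLs, the check flags only the second and third, since both carry orderby (and the third also carries author).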

Going against the search index isn’t inherently a bad thing. Using the search index is an incredibly efficient way of returning all the videos that match an arbitrary keyword, or ordering a feed of videos so that they’re sorted by view count. The important thing to realize is that the search index doesn’t need to be used for tasks that the backend database can handle, and you’ll get fresher results from the backend database.

Until now we’ve been focusing on retrieving a feed of videos uploaded to a specific account, but these same principles apply to looking up a single video with a given ID as well. Using the information above, can you determine which of these URLs will request a video entry from the backend database, and which will go against the search index?

  1. https://gdata.youtube.com/feeds/api/videos/sOEAD-gfJ_M?v=2
  2. http://gdata.youtube.com/feeds/api/videos?q=sOEAD-gfJ_M&v=2

As you’ve probably figured out, the first URL retrieves the entry for video ID sOEAD-gfJ_M directly from the backend database, while the second URL searches for all entries with metadata containing sOEAD-gfJ_M and then returns the one matching result. The results look similar, but only the first URL will give you the complete, up-to-date video metadata. As such, we recommend always using that syntax when retrieving the entry for a video whose ID you know.
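
If you already have a video ID in hand, a small helper can keep you on the direct-lookup form. The sketch below is a minimal Python example that builds the same kind of URL as the first one in the list; the function name and the ENTRY_BASE constant are hypothetical, and the code only constructs the URL without sending a request.

```python
from urllib.parse import quote, urlencode

# Base path for a single video entry, taken from the first URL in the list.
ENTRY_BASE = "https://gdata.youtube.com/feeds/api/videos/"

def video_entry_url(video_id: str, version: int = 2) -> str:
    """Build the direct entry URL for a known video ID.

    The ID goes in the path rather than in a q= search parameter, so the
    request is a lookup by ID and not a keyword search.
    """
    if not video_id:
        raise ValueError("video_id must not be empty")
    # Percent-encode the ID so it cannot alter the path or query string.
    return ENTRY_BASE + quote(video_id, safe="") + "?" + urlencode({"v": version})

if __name__ == "__main__":
    print(video_entry_url("sOEAD-gfJ_M"))
```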

Cheers,
-Jeff Posnick, YouTube API Team