Last time around we looked at the YouTube Player API, which allows you to customize, skin and otherwise control the playback of YouTube videos.

Now it's time to explore the YouTube Data API, which you can use to request and store info about movies you'd like to display on your site.

There are a variety of client libraries available for the YouTube Data API, including [http://code.google.com/apis/youtube/developers_guide_php.html PHP], [http://code.google.com/apis/youtube/developers_guide_dotnet.html .NET], [http://code.google.com/apis/youtube/developers_guide_java.html Java] and [http://code.google.com/apis/youtube/developers_guide_python.html Python].

We'll be using the Python library here, but the general concepts are the same no matter which language you use.

== Getting Started ==

Let's say you frequently post movies to YouTube and you're tired of cutting and pasting the embed code to get them to show up on your site.

Using the YouTube Data API and some quick Python scripts, we can grab all our movies, along with some metadata, and automatically add them to our database. For instance, if you followed along with a Django tutorial series, this would be a handy way to add YouTube to your list of data providers.

To get started, go ahead and download the [http://code.google.com/support/bin/answer.py?answer=75582 Python YouTube Data Client Library]. Follow the instructions for installing the library as well as its dependencies (in this case, ElementTree, which is only necessary if you aren't running Python 2.5).

Now, just to make sure you've got everything set up correctly, fire up a terminal window, start Python and try importing the modules we need:

<pre>
>>> import gdata.youtube
>>> import gdata.youtube.service
</pre>

Assuming those work, you're ready to start grabbing data.

== Working with the YouTube Data API ==

The first thing we need to do is construct an instance of the YouTube Data service. Enter this code at the prompt:

<pre>
yt_service = gdata.youtube.service.YouTubeService()
</pre>

That's a generic object, with no authentication, so we can only retrieve public feeds, but for our purposes that's all we need. (Authentication only becomes necessary when you want to write data back, like uploading or modifying videos.)
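
For reference, if you ever do need an authenticated service, the setup looks roughly like this. This is a minimal sketch -- the e-mail address, password, application name and developer key are all placeholders you'd swap for your own values:

<pre>
import gdata.youtube.service

yt_service = gdata.youtube.service.YouTubeService()

# Placeholder credentials -- substitute your own account details
yt_service.email = 'you@example.com'
yt_service.password = 'your-password'
yt_service.source = 'my-example-app'
yt_service.client_id = 'my-example-app'
yt_service.developer_key = 'YOUR_DEVELOPER_KEY'

# Log in via Google's ClientLogin mechanism
yt_service.ProgrammaticLogin()
</pre>

We won't need any of that below, though, so feel free to skip it.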

First, let's write a function that can handle the data we'll be retrieving. Create a new text file named youtube_client.py and paste in this code:

<pre>
import gdata.youtube
import gdata.youtube.service

class YoutubeClient:
    def __init__(self):
        # An unauthenticated service object; fine for public feeds
        self.yt_service = gdata.youtube.service.YouTubeService()

    def print_items(self, entry):
        # Print the metadata we care about for a single video entry
        print 'Video title: %s' % entry.media.title.text
        print 'Video published on: %s' % entry.published.text
        print 'Video description: %s' % entry.media.description.text
        print 'Video category: %s' % entry.media.category[0].text
        print 'Video tags: %s' % entry.media.keywords.text
        print 'Video flash player URL: %s' % entry.GetSwfUrl()
        print 'Video duration: %s' % entry.media.duration.seconds
        print '----------------------------------------'

    def get_items(self, feed):
        # Walk through every entry in the feed
        for entry in feed.entry:
            self.print_items(entry)
</pre>

Now, obviously, if you want to store the data you're about to grab, you'll need to rewrite the <code>print_items</code> function to do something other than just print it out. But for the sake of example (and because there's a near-infinite number of ways your database could be structured), we'll stick with a simple print function for now.
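
If you'd like a concrete picture of what a storage version might look like, here's a minimal sketch using Python's built-in <code>sqlite3</code> module (available as of Python 2.5). The database file name, table and columns are all invented for the example:

<pre>
import sqlite3

def save_items(self, feed):
    # 'videos.db' is an arbitrary file name; sqlite3 creates it if needed
    conn = sqlite3.connect('videos.db')
    conn.execute("""CREATE TABLE IF NOT EXISTS videos
                    (title TEXT, published TEXT, swf_url TEXT UNIQUE)""")
    for entry in feed.entry:
        # INSERT OR IGNORE skips videos we've already stored,
        # thanks to the UNIQUE constraint on swf_url
        conn.execute('INSERT OR IGNORE INTO videos VALUES (?, ?, ?)',
                     (entry.media.title.text,
                      entry.published.text,
                      entry.GetSwfUrl()))
    conn.commit()
    conn.close()
</pre>

Drop that into the <code>YoutubeClient</code> class alongside <code>get_items()</code> and you have a crude but working archive.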

So make sure that <code>youtube_client.py</code> is on your Python path, then fire up Python again and enter these lines:

<pre>
>>> from youtube_client import YoutubeClient
>>> client = YoutubeClient()
>>> client.get_items(client.yt_service.GetMostLinkedVideoFeed())
</pre>

The last line should produce a barrage of output as the client prints out a list of the most-linked videos on YouTube and all the associated data. To get that list we used one of the YouTube service object's built-in methods, <code>GetMostLinkedVideoFeed()</code>.
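
There are similar one-liners for the other standard feeds. For instance, these should work the same way (method names as given in the client library's developer guide):

<pre>
>>> client.get_items(client.yt_service.GetTopRatedVideoFeed())
>>> client.get_items(client.yt_service.GetMostViewedVideoFeed())
>>> client.get_items(client.yt_service.GetRecentlyFeaturedVideoFeed())
</pre>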

Okay, that's all well and good if you want the most linked videos on YouTube, but what about ''our'' uploaded videos?

To do that we're going to use another method of the YouTube service object, this time <code>GetYouTubeVideoFeed()</code>, which takes a feed URL as its argument.

First, find the video feed URL for your account, which should look something like this:

<pre>
http://gdata.youtube.com/feeds/api/users/YOURUSERNAME/uploads
</pre>

So let's plug that into our already running client with these two lines:

<pre>
>>> url = 'http://gdata.youtube.com/feeds/api/users/YOURUSERNAME/uploads'
>>> client.get_items(client.yt_service.GetYouTubeVideoFeed(url))
</pre>

You should see a list of your most recently uploaded videos, along with all the metadata we plugged into our <code>print_items()</code> function.
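
And since the whole point was to stop cutting and pasting embed code, here's one way you might generate it from an entry. This <code>get_embed_html()</code> helper is hypothetical -- it isn't part of the client library -- and the markup mirrors YouTube's standard object/embed snippet:

<pre>
def get_embed_html(self, entry, width=425, height=355):
    # Build old-style object/embed markup from the entry's flash URL
    swf_url = entry.GetSwfUrl()
    return ('<object width="%d" height="%d">'
            '<embed src="%s" type="application/x-shockwave-flash" '
            'width="%d" height="%d"></embed></object>'
            % (width, height, swf_url, width, height))
</pre>

Add that to the <code>YoutubeClient</code> class and you can spit out ready-to-paste embed code for every video in a feed.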

== Conclusion ==

Hopefully this has given you some insight into how the Data API works. We've really just scratched the surface; there are dozens of methods available to retrieve all sorts of data -- see the [http://code.google.com/apis/youtube/developers_guide_python.html Python YouTube Data API guide] for more details.

While we've used Python, the methods and techniques are essentially the same for all the client libraries, so you should be able to interact with YouTube via a language you're comfortable with.

Obviously you'll need to adjust the <code>print_items()</code> function to do something more useful than just printing the results. If you're using Django, create a model to hold the data and then plug it in with the model manager's <code>get_or_create()</code> method.
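
Here's a rough sketch of what that might look like. The <code>Video</code> model and its fields are invented for the example; adapt them to your own schema:

<pre>
# models.py -- a minimal, made-up model for storing video entries
from django.db import models

class Video(models.Model):
    title = models.CharField(max_length=250)
    published = models.CharField(max_length=50)
    swf_url = models.URLField(unique=True)

# Then, in place of print_items():
def save_items(self, entry):
    # get_or_create() looks up an existing row by swf_url and only
    # creates a new one (with the defaults) if it isn't there yet
    Video.objects.get_or_create(
        swf_url=entry.GetSwfUrl(),
        defaults={'title': entry.media.title.text,
                  'published': entry.published.text})
</pre>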

For full automation, write a small script that calls the methods we used above and attach it to a cron job.
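
For example, something along these lines would do the trick (the file name and paths are whatever you like):

<pre>
# update_videos.py -- fetch our uploads feed on a schedule
from youtube_client import YoutubeClient

url = 'http://gdata.youtube.com/feeds/api/users/YOURUSERNAME/uploads'
client = YoutubeClient()
client.get_items(client.yt_service.GetYouTubeVideoFeed(url))
</pre>

Then a crontab entry like <code>0 * * * * python /path/to/update_videos.py</code> will run it once an hour.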

And there you have it -- an easy way to add YouTube videos to your own personal site, with no manual labor on your end.