Getting Started with Media Search


This quickstart shows a simple way to get started with the Clarify API. Following these steps, you should have a fully functional search for your audio in about 5 to 10 minutes.

Configuring Your Environment

While you can use any programming language you choose, we provide a few helper libraries to get you started. In most cases, you can use your favorite package manager:

  • curl
  • PHP
  • Java
  • Python
  • Ruby
Although we don't have a curl library, the command-line JSON parser 'jq' is super helpful. Download and install it to get started: http://stedolan.github.io/jq/
In PHP, you can install the Clarify SDK using Composer. In composer.json add:
{
    "require": {
        "clarify/clarify-helper": "~2.0"
    }
}
And then simply run:
composer install

# Don't forget to use sudo if appropriate.
In Java, you simply install the Clarify SDK via your Maven pom.xml:
<dependency>
    <groupId>io.clarify.api</groupId>
    <artifactId>clarify-api-sdk</artifactId>
    <version>1.0.0</version>
</dependency>
In Python, you simply install the Clarify module via pip:
pip install clarify_python

# Don't forget to use sudo if appropriate.
In Ruby, you simply install the Clarify gem and set your API key as an environment variable:
gem install clarify

export CLARIFY_API_KEY=abcde12345

# Don't forget to use sudo if appropriate.

Loading Audio

First, include the SDK and create a client object using your API key. You can then use that object to load each of your audio files as shown:

  • curl
  • PHP
  • Java
  • Python
  • Ruby
curl --data "media_url=http://media.clarify.io/audio/books/dorothyandthewizardinoz_01_baum_64kb.mp3" \
     --data "notify_url=http://example.org/sample-receiver" \
     --data "name=Dorothy and the Wizard of Oz" https://api.clarify.io/v1/bundles \
     -X POST --header "Authorization: Bearer myapikey" | jq '.'
# The jq portion is optional and is only used to pretty-print the resulting JSON
<?php

require 'vendor/autoload.php';

$bundle = new Clarify\Bundle('my api key');
$result = $bundle->create('Dorothy and the Wizard of Oz', 'http://media.clarify.io/audio/books/dorothyandthewizardinoz_01_baum_64kb.mp3');
import io.clarify.api.*;
import java.net.URI;

public class App {
    public static void main(String[] args) throws Exception {
        String appKey = "my api key";

        // Construct the API client
        ClarifyClient client = new ClarifyClient(appKey);

        // Create your first bundle using an example audio file
        String name = "Dorothy and the Wizard of Oz";
        URI mediaUrl =  URI.create("http://media.clarify.io/audio/books/dorothyandthewizardinoz_01_baum_64kb.mp3");
        Bundle bundle = client.createBundle(name, mediaUrl);

        System.out.println(bundle.id());
    }
}
from clarify_python import clarify

client = clarify.Client('my api key')

client.create_bundle(name='Dorothy and the Wizard of Oz',
        media_url='http://media.clarify.io/audio/books/dorothyandthewizardinoz_01_baum_64kb.mp3')
require 'clarify'
require 'pp'

# This assumes that you set up your key in the environment using: export CLARIFY_API_KEY=myapikey
clarify = Clarify::Client.new(api_key: ENV['CLARIFY_API_KEY'])

created_bundle = clarify.bundles.create!(
    name: 'Dorothy and the Wizard of Oz',
    media_url: 'http://media.clarify.io/audio/books/dorothyandthewizardinoz_01_baum_64kb.mp3'
)

pp created_bundle

Naming the bundle and providing a notify_url are both optional. We have a number of audio and video files available for processing on our Media Page.

Note: You don't have to download these files. Instead, you can pass their URLs via the create/POST method shown above.
After creating a bundle, you'll receive a response which looks something like this:
{
    "id":"abcde12345",
    "_class":"Ref",
    "_links":{
        "self":{
            "href":"/v1/bundles/abcde12345"
        },
        "curies":[
            {
                "href":"/docs/rels/{rel}",
                "name":"clarify",
                "templated":true
            }
        ],
        "clarify:metadata":{
            "href":"/v1/bundles/abcde12345/metadata"
        },
        "clarify:tracks":{
            "href":"/v1/bundles/abcde12345/tracks"
        },
        "clarify:insights":{
            "href":"/v1/bundles/abcde12345/insights"
        }
    }
}
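
The links in this response are what later calls build on. As a minimal sketch using only the Python standard library (no Clarify SDK), the snippet below parses a response shaped like the example above and pulls out the bundle id and the relative hrefs. The helper name bundle_links and the choice to drop the "clarify:" prefix are illustrative assumptions, not part of the API.

```python
import json

# Compact copy of the sample response shown above.
SAMPLE_RESPONSE = """
{
    "id": "abcde12345",
    "_class": "Ref",
    "_links": {
        "self": {"href": "/v1/bundles/abcde12345"},
        "curies": [{"href": "/docs/rels/{rel}", "name": "clarify", "templated": true}],
        "clarify:metadata": {"href": "/v1/bundles/abcde12345/metadata"},
        "clarify:tracks": {"href": "/v1/bundles/abcde12345/tracks"},
        "clarify:insights": {"href": "/v1/bundles/abcde12345/insights"}
    }
}
"""


def bundle_links(response_text):
    """Return (bundle_id, {relation: href}) from a bundle response body.

    Hypothetical helper: it skips "curies" (a list describing the link
    namespace) and drops the "clarify:" prefix from relation names.
    """
    body = json.loads(response_text)
    links = {}
    for rel, target in body["_links"].items():
        # Only plain link objects carry an "href"; "curies" is a list.
        if isinstance(target, dict) and "href" in target:
            links[rel.split(":", 1)[-1]] = target["href"]
    return body["id"], links


bundle_id, links = bundle_links(SAMPLE_RESPONSE)
print(bundle_id)
print(links["tracks"])
```

Keeping the hrefs in a dictionary means later steps can look up the tracks or metadata link by name instead of rebuilding URLs by hand.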

Searching Audio

Note: While we process files as we receive them, there will be a delay before your file is available for searching. Processing normally takes about 1 minute for every minute of audio or video.
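
Because of this delay, a script that creates a bundle and searches it right away may come back empty. One option is to poll until the media is ready. The sketch below is a generic polling helper and not part of any Clarify SDK: wait_until and its parameters are hypothetical names, and the is_ready check is something you would supply yourself (for example, a function that runs a search and looks for the new bundle). Following the rule of thumb above, a timeout of roughly the media duration plus some margin is a reasonable starting point.

```python
import time


def wait_until(is_ready, timeout_s, interval_s=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call is_ready() every interval_s seconds until it returns True.

    Returns True if is_ready() succeeded, or False if another wait
    would run past timeout_s. is_ready is supplied by the caller.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        # Stop if the next sleep would take us past the deadline.
        if clock() + interval_s > deadline:
            return False
        sleep(interval_s)


# Demo with a stand-in check that succeeds on the third attempt.
attempts = []


def fake_check():
    attempts.append(1)
    return len(attempts) >= 3


ready = wait_until(fake_check, timeout_s=1.0, interval_s=0.01)
print(ready, len(attempts))
```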

To search, use the same object you created before and pass in your keywords. If you uploaded the “Wizard of Oz” audio clip, you can search for “dorothy”. Then you can process and interact with the results however you wish.

The code below simply shows the resulting bundle id, bundle name, and the start/end offsets for each occurrence of the search terms:

  • curl
  • PHP
  • Java
  • Python
  • Ruby
curl https://api.clarify.io/v1/search?query=dorothy \
    --header "Authorization: Bearer myapikey" | jq '.'
# The jq portion is optional and is only used to pretty-print the resulting JSON
<?php

require 'vendor/autoload.php';

$bundle = new Clarify\Bundle('my api key');
$page = $bundle->search('dorothy');

$results = $page['item_results'];
$items = $page['_links']['items'];
foreach ($items as $index => $item) {
    $_bundle = $bundle->load($item['href']);

    echo $_bundle['_links']['self']['href'] . "\n";
    echo $_bundle['name'] . "\n";

    $search_hits = $results[$index]['term_results'][0]['matches'][0]['hits'];
    foreach ($search_hits as $search_hit) {
        echo $search_hit['start'] . ' -- ' . $search_hit['end'] . "\n";
    }
}
import io.clarify.api.*;
import java.net.URI;

public class App {
    public static void main(String[] args) throws Exception {
        String appKey = "my api key";

        // Construct the API client
        ClarifyClient client = new ClarifyClient(appKey);

        // Search your media by query string
        String query = "dorothy";
        BundleSearchResults bundleSearchResults = client.searchBundles(query);
        JSONArray itemResults = bundleSearchResults.getItemResults();

        for(int i=0;i<itemResults.length();i++) {
            JSONObject item = (JSONObject)itemResults.get(i);
            System.out.println("score="+item.get("score"));
        }
    }
}
from clarify_python import clarify

client = clarify.Client('my api key')

result = client.search(query='dorothy')
results = result['item_results']
items = result['_links']['items']

for index, item in enumerate(items):
    bundle = client.get_bundle(item['href'])

    print(bundle['name'])

    search_hits = results[index]['term_results'][0]['matches'][1]['hits']
    for search_hit in search_hits:
        print(str(search_hit['start']) + ' -- ' + str(search_hit['end']))
require 'clarify'

clarify = Clarify::Client.new(api_key: 'docs-api-key')

results = clarify.bundles.search('dorothy')

results.each do |bundle_results, bundle_url|
    # Fetch the bundle:
    bundle = clarify.get(bundle_url)

    puts "#{bundle.name} - #{bundle_url}"
    bundle_results['term_results'].each do |term_result|
        term_result['matches'].each do |match|
            type = match['type']
            match['hits'].each do |hit|
                puts "\tmatched #{type} content at #{hit['start']} to #{hit['end']}"
            end
        end
    end
end

And here are the results using the Wizard of Oz clip we loaded:

/v1/audio/8ee0e56929c248ba895d19ead47c9993
Dorothy and the Wizard of Oz
2.04 -- 2.53
15.44 -- 16.09
172.44 -- 172.79
192.05 -- 192.45
224.76 -- 225.07
235.43 -- 236.02
271.52 -- 271.89
329.1 -- 329.56
390.09 -- 390.46
406.8 -- 407.17
480.47 -- 480.87
512.95 -- 513.25
Note: Your results may be slightly different as our systems use machine learning and are improving constantly.
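
The examples above all walk the same nested structure: item_results, then term_results, then matches, then hits. As a rough sketch, the helper below flattens that structure into one tuple per hit and formats the offsets as minutes and seconds. The names iter_hits and as_timestamp are hypothetical, the sample dictionary is a hand-trimmed stand-in for a real response, and the match type value "audio" is an assumption made for illustration.

```python
def iter_hits(result):
    """Yield (bundle_href, match_type, start, end) for every hit.

    Assumes result["item_results"] and result["_links"]["items"] are
    parallel lists, which is how the examples above treat them.
    """
    items = result["_links"]["items"]
    for item, item_result in zip(items, result["item_results"]):
        for term_result in item_result.get("term_results", []):
            for match in term_result.get("matches", []):
                for hit in match.get("hits", []):
                    yield item["href"], match.get("type"), hit["start"], hit["end"]


def as_timestamp(seconds):
    """Format an offset in seconds as minutes:seconds with two decimals."""
    # Round to hundredths first so the seconds part never shows 60.00.
    minutes, centis = divmod(round(seconds * 100), 6000)
    return f"{minutes}:{centis / 100:05.2f}"


# Hand-built sample in the shape the examples above rely on.
SAMPLE_RESULT = {
    "_links": {"items": [{"href": "/v1/bundles/abcde12345"}]},
    "item_results": [
        {
            "term_results": [
                {
                    "matches": [
                        {
                            "type": "audio",  # assumed value
                            "hits": [
                                {"start": 2.04, "end": 2.53},
                                {"start": 15.44, "end": 16.09},
                            ],
                        }
                    ]
                }
            ]
        }
    ],
}

for href, match_type, start, end in iter_hits(SAMPLE_RESULT):
    print(href, match_type, as_timestamp(start), "--", as_timestamp(end))
```

Unlike indexing into the first term result and a fixed match, iterating over every level keeps the code from failing when a bundle has fewer matches than expected.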

Putting it All Together

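From here, we can visualize our search results with our included audio player. The player should work with minimal additional configuration, since the bulk of the logic is already covered by the search code above. The snippets below collect the values the player needs: the search terms, the item results, and the media URL of the first matching bundle.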

  • PHP
  • Python
<?php

require 'vendor/autoload.php';

$bundle = new Clarify\Bundle('my api key');
$terms = 'dorothy'; // the search terms, for example taken from user input
$items = $bundle->search($terms);

$search_terms = json_encode($items['search_terms']);
$item_results = json_encode($items['item_results']);

$audiokey = $items['_links']['items'][0]['href'];
$tracks = $bundle->tracks->load($audiokey)['tracks'];
$mediaUrl = $tracks[0]['media_url'];
from clarify_python import clarify
import json

client = clarify.Client('my api key')

result = client.search(query='dorothy')
search_terms = json.dumps(result['search_terms'])
item_results = json.dumps(result['item_results'])

# Follow the links from the first matching bundle to its track list
bundleref = result['_links']['items'][0]['href']
bundle = client.get_bundle(bundleref)
tracksref = bundle['_links']['clarify:tracks']['href']
tracks = client.get_track_list(tracksref)['tracks']
mediaURL = tracks[0]['media_url']