Community Content
From Flux Developer Portal
Content Discovery
Content Discovery is the process where flux creates a content record and unique ID for externally hosted content. Ideally this is done proactively, before users interact with your content. This ensures optimal widget functionality and metadata presentation within the flux universe.
flux's Content Discovery System (CDS) allows external website content to be ingested and rendered correctly on any flux-powered site. The discovery process always brings in the title and generates a thumbnail for each piece of content, and additionally ingests the content itself for rendering on flux sites. You, the partner, control how this content is ingested; for example, it is possible to display only a snippet of the body of a post and send the user back to the original website to read the entire post.
Content Discovery can be triggered in 2 ways:
- The website owner provides flux with a sample URL permalink from which content will be discovered, during sign up.
- The website owner provides flux with an RSS feed of the content to be discovered, during sign up.
In both cases, the CDS is tuned to correctly ingest the content and display it to users in an optimal output format.
Setting up Content Discovery
In order to integrate the community services provided by the flux platform, it is necessary to expose metadata for your content so that it can be discovered by flux. The reason for doing this can be illustrated by this use case:
- UserXYZ is watching a video (VideoID 23456) on your external site.
- The user likes the video and clicks the "add to my feed" button. This button is actually a flux widget.
- Upon clicking the button, a call is made to the flux back end to add the video to the user's feed (remember that all user data is stored in the flux system). Included in the call are the UserID of the user who clicked the button and an identifier for the piece of content that should be added to the feed.
- When flux processes the "add to my feed" request, it will perform a lookup on the content identifier to see whether or not it has any information about it. If not, it will invoke its content discovery engine to make a request back to a pre-determined page on your site to discover the metadata about the video. The request will include the content identifier as a query parameter. The response will be an XML document with the metadata for the identifier. In this example, the document returned will contain information about the video (e.g., title, artist, thumbnail image, permalink).
- The next day, a user on flux.com goes to view UserXYZ's profile page after finding them in a search. UserXYZ's feed is displayed on their profile page, and it includes the video from your site that they added. Through the content discovery machinery, flux.com is able to display basic information about the video along with a link back to your site to watch it.
- The content identifier referred to above is a string which follows the Content URI Specification. So in the example above, the content identifier passed to flux via the "add to my feed" widget is
mgid:uma:video:[yoursite.com]:23456.
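As a rough illustration of how such an identifier could be assembled and taken apart, here is a minimal Python sketch. It assumes the five colon-separated parts seen in the examples on this page; the Content URI Specification itself is not reproduced here, and the field names (`namespace`, `content_type`, `site`, `content_id`) and both helper functions are hypothetical labels chosen for this example.

```python
from typing import NamedTuple


class ContentUri(NamedTuple):
    """Parts of a content URI, as inferred from this page's examples (assumption)."""
    namespace: str     # e.g. "uma" or "cms"
    content_type: str  # e.g. "video"
    site: str          # e.g. "yoursite.com"
    content_id: str    # e.g. "23456"


def build_content_uri(namespace, content_type, site, content_id):
    """Join the parts into an mgid-style URI string."""
    return f"mgid:{namespace}:{content_type}:{site}:{content_id}"


def parse_content_uri(uri):
    """Split an mgid-style URI into its parts, rejecting unexpected shapes."""
    parts = uri.split(":")
    if len(parts) != 5 or parts[0] != "mgid":
        raise ValueError(f"not a five-part mgid URI: {uri!r}")
    return ContentUri(*parts[1:])


uri = build_content_uri("uma", "video", "yoursite.com", 23456)
print(uri)
print(parse_content_uri(uri))
```

A widget integration would typically build the URI once per content item and pass the same string everywhere that item is referenced.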
The main goal of the discovery process is to provide flux with enough information about a community's content to enable information about it to be displayed on pages generated by the flux back end (the User Profile page, for example). Typically, the required metadata includes title, description, thumbnail, and a permalink to the content detail page on the owning community's site.
flux Content Types
flux supports a limited set of top level content 'Types':
- Video (appears in user profile My Videos Widget)
- Photo (appears in user profile My Photos Widget)
- Post (appears in user profile My Links Widget)
Note: Post is the default 'catch-all' content type for external content. It is often used in conjunction with a custom content Alias.
Defining Content Types
Change the URI structure which is output by your external CMS to something like mgid:cms:[content_type]:[siteURL.com]:179561.
Example: mgid:cms:video:sitename.com:12345
Or, dynamically prepend a "fluxtype" prefix (e.g. fluxtype:photo) to the URI to identify the content type. The flux content discovery system will use the prefix to determine the content type (it will be built into the discovery rule), but will then drop it and store only the original URI.
This can easily be done by flux widgets when building the content URI to use via some simple js code.
Example: fluxtype:photo:mgid:cms:item:mtv.com:179561, where the URI remains mgid:cms:item:mtv.com:179561. Similarly, for videos you could use: fluxtype:video:mgid:cms:item:mtv.com:11785.
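The prefix handling described above can be sketched as follows. This is only an illustration of the string manipulation involved, written in Python rather than the widget-side js mentioned above; the function names `add_fluxtype` and `split_fluxtype` are hypothetical and are not part of any flux API.

```python
FLUXTYPE_PREFIX = "fluxtype:"


def add_fluxtype(content_type, uri):
    """Prepend a fluxtype prefix to a content URI."""
    return f"{FLUXTYPE_PREFIX}{content_type}:{uri}"


def split_fluxtype(prefixed_uri):
    """Return (content_type, original_uri); content_type is None if no prefix is present."""
    if not prefixed_uri.startswith(FLUXTYPE_PREFIX):
        return None, prefixed_uri
    rest = prefixed_uri[len(FLUXTYPE_PREFIX):]
    content_type, sep, original_uri = rest.partition(":")
    if not sep or not original_uri:
        raise ValueError(f"malformed fluxtype URI: {prefixed_uri!r}")
    return content_type, original_uri


prefixed = add_fluxtype("photo", "mgid:cms:item:mtv.com:179561")
print(prefixed)
print(split_fluxtype(prefixed))
```

The second function mirrors what the text says the discovery system does: read the type from the prefix, then keep only the original URI.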
Content 'Alias'
Communities may optionally specify a content 'Alias' (label) for their content during Content Discovery. If specified, the Alias will be displayed within the user's activity feed, e.g., if a video item is given the Alias of "mashup", a commenting activity item would read: "[Display Name] commented on a mashup from [Community Name]".
- Aliases should be predefined and submitted to flux as part of the content discovery configuration.
- Content may be filtered by Alias within DAAPI and FML data requests.
Defining Content Aliases
Aliases should be set via the Discovery XML document. E.g., <alias>Picture Set</alias>
Exposing Content to the Content Discovery Engine
There are four methods by which content can be exposed to flux's content discovery engine: default, smart HTML scraping, RSS feed, and custom XML delivery. The engine is configured such that each community can specify its own method of discovery. Each of these four options is described in more detail below.
Method 1: Custom XML Delivery
This method requires a configuration request
Content is identified to flux by a Content URI. When the discovery engine encounters a Content URI for the first time, it will hit a page (the URL of which is specified by the community), passing the URI as a parameter. The document returned by that page is expected to be an XML document containing the relevant metadata for the content. The community will also specify which elements contain the necessary bits of metadata. A typical XML document will look like the following:
    <?xml version="1.0" encoding="iso-8859-1" ?>
    <video uri="mgid:uma:video:mtv.com:180550">
      <title>Will.I.Am - "Video Envy"</title>
      <alias>Episode</alias>
      <description>He's got Mad Madonna jealousy!</description>
      <thumbnail width="100" height="100" src="http://www.mysite.com/img/cool_video_thumb.jpg" />
      <link>http://www.mysite.com/videos/view/12345</link>
      <embed src="[URL]" width="414" height="338" type="application/x-shockwave-flash" FlashVars="CONFIG_URL=[URL]" allowFullScreen="true" AllowScriptAccess="never" base="." />
      <categories>
        <category>Category 1</category>
        <category>Category 2</category>
      </categories>
    </video>

    <?xml version="1.0" encoding="iso-8859-1" ?>
    <post uri="mgid:uma:post:mtv.com:180550">
      <title>Will.I.Am - "Video Envy"</title>
      <alias>News Article</alias>
      <description>He's got Mad Madonna jealousy!</description>
      <thumbnail width="100" height="100" src="http://www.mysite.com/img/cool_video_thumb.jpg" />
      <link>http://www.mysite.com/videos/view/12345</link>
    </post>
| Element | Required | Description |
|---|---|---|
| title | yes | Content title |
| alias | no | Label/Alias, as defined by publisher (MTV) during content discovery, will appear wherever content Alias is used. For instance, Activity Feed, Share Feed, etc. This will not change the content's core 'content type' |
| description | yes | Content description |
| thumbnail | yes | Content thumbnail. This image will be imported into flux during the discovery process |
| link | yes | Permalink / redirect URL back to content on publisher site |
| embed | no | Embed code used for media preview within flux hosted page previews (e.g., preview from a user's My Shared Items module); also used in sites where flux hosts video content detail pages, such as Real World Dailies |
| categories | no | Content may be discovered into existing flux categories. These categories must be pre-created by the Community Admin within the flux Community Manager interface. |
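For communities that generate this document programmatically, the following Python sketch shows one way to emit the elements listed in the table using the standard library. It is an assumption-laden example rather than a reference implementation: the function name and parameters are hypothetical, the optional embed element is omitted, and the result is returned without an XML declaration.

```python
import xml.etree.ElementTree as ET


def build_discovery_xml(content_type, uri, title, description,
                        thumbnail_src, link, alias=None, categories=(),
                        thumb_width=100, thumb_height=100):
    """Build a discovery document with the required and optional elements."""
    root = ET.Element(content_type, {"uri": uri})
    ET.SubElement(root, "title").text = title
    if alias:  # optional label shown wherever the content Alias is used
        ET.SubElement(root, "alias").text = alias
    ET.SubElement(root, "description").text = description
    ET.SubElement(root, "thumbnail", {
        "width": str(thumb_width),
        "height": str(thumb_height),
        "src": thumbnail_src,
    })
    ET.SubElement(root, "link").text = link
    if categories:  # optional; categories must already exist in flux
        wrapper = ET.SubElement(root, "categories")
        for name in categories:
            ET.SubElement(wrapper, "category").text = name
    return ET.tostring(root, encoding="unicode")


print(build_discovery_xml(
    "video",
    "mgid:uma:video:mtv.com:180550",
    'Will.I.Am - "Video Envy"',
    "He's got Mad Madonna jealousy!",
    "http://www.mysite.com/img/cool_video_thumb.jpg",
    "http://www.mysite.com/videos/view/12345",
    alias="Episode",
    categories=["Category 1", "Category 2"],
))
```

Building the document through an XML library rather than string concatenation means characters such as quotes and ampersands in titles are escaped automatically.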
Method 2: Default (reactive discovery)
If a community has not specified a method of content discovery, then this method is used. In this case, a community's content is identified to flux by permalink (URL). The content discovery engine will retrieve metadata about the content by making a request for the URL and scraping the HTML that is returned as follows:
- title is taken from the page's <title> element.
- thumbnail image is taken from a screenshot of the page.
- description is also taken from a screenshot of the page.
- permalink is the URL of the content itself.
In this case, content discovery happens in real time; in other words, content is discovered by flux only when necessary. For example, if a user hits a video page for the first time, and that page has a comment widget attached to the video, then the content will be discovered when the widget is loaded.
Method 3: Smart HTML Scraping
This method is similar to the Default method described above, but it allows the community to tell the content discovery engine how to parse the HTML to find each of the specific bits of metadata. For example, a community might be configured to expose content metadata as such:
- title is taken from the page's <title> element.
- thumbnail image is taken from the src attribute of the <img> tag whose id is "ThumbnailImage".
- description is taken from the "description" <meta> tag.
- alias (optional) may be defined via an "alias" <alias> tag.
- permalink is the URL of the content itself.
Note that this is only an example of how a community might set up its content discovery. In practice, each community would define its own rules.
As with the above method, content discovery happens in real time.
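To make the example rules above concrete, here is a small Python sketch that applies them to a page using only the standard library HTML parser. It illustrates the kind of extraction involved and is not how flux's engine is actually implemented; the class and function names are hypothetical, and the optional alias rule is left out.

```python
from html.parser import HTMLParser


class MetadataScraper(HTMLParser):
    """Collects title, thumbnail, and description using the example rules."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "img" and attrs.get("id") == "ThumbnailImage":
            self.metadata["thumbnail"] = attrs.get("src")
        elif tag == "meta" and attrs.get("name") == "description":
            self.metadata["description"] = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
            self.metadata["title"] = "".join(self._title_parts).strip()

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)


def scrape_metadata(html, permalink):
    """Return the scraped metadata plus the permalink of the page itself."""
    scraper = MetadataScraper()
    scraper.feed(html)
    scraper.close()
    return {**scraper.metadata, "permalink": permalink}


page = (
    "<html><head><title>Cool Video</title>"
    '<meta name="description" content="A short clip"></head>'
    '<body><img id="ThumbnailImage" src="http://www.mysite.com/img/cool_video_thumb.jpg">'
    "</body></html>"
)
print(scrape_metadata(page, "http://www.mysite.com/videos/view/12345"))
```

Pages that keep these elements stable across templates are easier to configure, since the same rule then works for every content detail page.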
Method 4: RSS Feed
With this method, content is discovered by flux via an RSS feed, the URL of which is provided by the community. Once per hour, the discovery engine will hit the RSS feed URL to get a list of new and/or updated content. Each item in the feed that has not yet been consumed by flux will be ingested.
With the methods described above, the discovery engine is invoked in real time (i.e., flux discovers content as users encounter it). With the RSS method, discovery happens offline. If a user encounters a piece of content that has not yet been discovered by flux, then the Default discovery method is invoked; in this case, the metadata will be overwritten when the item is later encountered in the RSS feed.
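The polling step can be pictured with the Python sketch below, which reads an RSS 2.0 document and returns the items whose links have not been seen before. This is a simplified assumption about the behavior: it keys items by their link, handles only new items (not updated ones), and the function name and the `seen_links` set are hypothetical.

```python
import xml.etree.ElementTree as ET


def new_feed_items(rss_text, seen_links):
    """Return items not yet ingested and record their links in seen_links."""
    root = ET.fromstring(rss_text)
    fresh = []
    for item in root.iter("item"):
        link = item.findtext("link", default="").strip()
        if not link or link in seen_links:
            continue  # skip items without a link or already consumed
        fresh.append({
            "title": item.findtext("title", default="").strip(),
            "link": link,
            "description": item.findtext("description", default="").strip(),
        })
        seen_links.add(link)
    return fresh


feed = """<rss version="2.0"><channel><title>My Site</title>
<item><title>First</title><link>http://www.mysite.com/a</link><description>One</description></item>
<item><title>Second</title><link>http://www.mysite.com/b</link></item>
</channel></rss>"""

seen = {"http://www.mysite.com/a"}
print(new_feed_items(feed, seen))
```

Running the same feed through the function a second time returns nothing, which matches the idea that only items not yet consumed are ingested.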
Triggering Content Discovery
There are a few different methods for triggering content discovery:
- Drop a flux widget onto a non-flux content detail page
- Manually enter the permalink via the flux Community Manager administration tool. See Triggering manual content discovery for flux hosted pages
- Use the flux Data Access API (DAAPI). You can pass flux an MGID via the following sample DAAPI query URL to trigger discovery: http://daapi.flux.com/2.0/00001/XML?[Community_UCID]/feeds/content/?q=mgid:uma:content:shared:1234
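For the DAAPI option, the query URL can be assembled from the community UCID and the MGID. The Python sketch below simply fills in the sample URL pattern shown above; the function name is hypothetical, the fixed path segments are copied verbatim from the sample, and whether the MGID needs URL encoding is not stated in this document, so it is passed through unchanged here.

```python
def daapi_discovery_url(community_ucid, mgid):
    """Fill the sample DAAPI query URL pattern with a UCID and an MGID."""
    return (
        "http://daapi.flux.com/2.0/00001/XML?"
        f"{community_ucid}/feeds/content/?q={mgid}"
    )


# "YOUR_UCID" is a placeholder, not a real community identifier.
print(daapi_discovery_url("YOUR_UCID", "mgid:uma:content:shared:1234"))
```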
Making Content Discoverable on a Dynamo Hosted Site
When flux makes a request to discover a piece of content, it will hit whatever URL you have configured for your community's content discovery. The typical pattern is: http://YOUR_SITE/sitewide/data/xml/access/entity.jhtml?uri=CONTENT_URI
This page includes another page that does most of the work. The source for this page can be found in the CODE branch of TeamSite here: CODE/docs/data/xml/access/entity.jhtml
When you look at the JHTML source for this page, you'll see that it calls a droplet to resolve the content URI and then Switches on the type of the resolved entity. If the Switch does not include an oparam for the type of content you need to be discoverable, you will have to add it here first. You can follow the format of the existing oparams. For example, to add support for an entity of type "foo":
The first thing that occurs is to call the include.jhtml file. This droplet checks to see if a site-specific override exists for the entity type by looking to see if a file exists here in your site's document root: /sitewide/data/xml/access/FILE
In the path above, FILE is the value of the file parameter passed to include.jhtml. If the file exists, then it is rendered. If not, this file is rendered: /global/data/xml/access/FILE
This file MUST exist, so when you add content discovery support for a new type of content, you must at a minimum create this file. In doing so, keep in mind that anything in /global is shared by all sites, so you should never put any site-specific logic or content here.
So what does one put in this file? You can put whatever metadata about the content you want, but it must be well-formed XML. You MUST NOT INCLUDE the <?xml ... ?> declaration, as this (and the appropriate Content-Type header setting) are handled by entity.jhtml. Exactly how much metadata is needed will be determined by the business requirements for displaying the metadata on flux.com. However, it's best to limit it and include only what is needed.
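The override lookup described above (site-specific file first, then the shared /global file) can be sketched in Python as follows. This only models the fallback order and is not the include.jhtml source; the function name is hypothetical, and it assumes for illustration that both directories sit under a single document root.

```python
from pathlib import Path


def resolve_discovery_file(doc_root, file_name):
    """Pick the site-specific file if present, else the shared global one."""
    site_specific = Path(doc_root) / "sitewide" / "data" / "xml" / "access" / file_name
    if site_specific.is_file():
        return site_specific
    shared = Path(doc_root) / "global" / "data" / "xml" / "access" / file_name
    if not shared.is_file():
        # The global file must exist for every supported content type.
        raise FileNotFoundError(f"missing required global file: {shared}")
    return shared
```

Keeping the shared file generic and placing any site-specific variation in the override keeps /global safe for all sites that use it.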
Discovery for Multiple Communities
When content discovery documents for different communities need to be served from the same site, the typical pattern is to replace "sitewide" with the community name. So if content discovery for your site is served from: http://www.yoursite.com/sitewide/data/xml/access/entity.jhtml
the content discovery URL for your second site, also served from yoursite.com, would be: http://www.yoursite.com/yoursecondsite/data/xml/access/entity.jhtml
It's also important that the URI structure for the document itself be unique. flux configures unique content discovery rules against unique URI structures. For example, mgid:uma:video:yoursite.com:23456 is the URI structure for yoursite.com, but mgid:uma:video:yoursite3.com:23456 is the URI structure for yoursite3.com.
Using Content URIs with flux Widgets in JHTML pages
Any flux widget that is tied to a piece of content (examples include the Comment and ContentAction widgets) will require a parameter that tells the widget the URI of the content to which it is being attached. To build this URI in a JHTML page, use the EntityUriCreator droplet. This droplet requires one parameter that specifies the entity for which to generate a Content URI. For example, say we need to generate a Content URI for a CMS Item, but we only have the CMS ItemID. First use the CMS GetItem droplet to look up the item, then pass it to the EntityUriCreator droplet:
Date Formats for Content Discovery
We'd like to see the dates in one of the following formats if time will not be included.
- 2009-04-17
- 4/17/2006
And if time will be included, then use one of the following time formats:
- 2009-04-17 2:22:48 PM
- 2009-04-17 2:22:48 PM -6
- 4/17/2006 2:22:48 AM +6
- 4/17/2006 2:22:48
Preferred formats are the top ones in the respective lists above.
Please note that if you don't specify a time zone (+ or - sign and GMT offset), the date and time will be interpreted as the server's local time (Pacific, USA).
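As an illustration, the Python sketch below formats a datetime in the preferred date-and-time layout from the list above, appending the GMT offset only when the value carries time zone information. The function name is hypothetical, and the sketch assumes whole-hour offsets, since those are the only kind shown in the examples.

```python
from datetime import datetime, timedelta, timezone


def format_discovery_date(dt):
    """Format as 'YYYY-MM-DD h:mm:ss AM/PM [+/-offset]'."""
    hour12 = dt.hour % 12 or 12          # 0 and 12 both display as 12
    meridiem = "AM" if dt.hour < 12 else "PM"
    text = f"{dt:%Y-%m-%d} {hour12}:{dt:%M:%S} {meridiem}"
    offset = dt.utcoffset()
    if offset is not None:
        hours, remainder = divmod(offset.total_seconds(), 3600)
        if remainder:
            raise ValueError("only whole-hour GMT offsets are handled in this sketch")
        text += f" {int(hours):+d}"
    return text


central = timezone(timedelta(hours=-6))
print(format_discovery_date(datetime(2009, 4, 17, 14, 22, 48, tzinfo=central)))
print(format_discovery_date(datetime(2009, 4, 17, 14, 22, 48)))
```

Including the offset explicitly avoids having timestamps silently read as Pacific time.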
Triggering Manual Content Discovery for flux Hosted Pages
Sometimes, it's necessary to manually discover a single piece of content into a flux community in order to make it available for commenting, rating and sharing.
This section gives a quick tutorial on how to do that.
- Log in to your flux community as the administrator.
- After you've logged in, click on the More button on the QuickMenu and select "Community Media". The direct link to this page is http://[Community].flux.com/profile/[Admin Username]/Content/MyMedia.aspx
- In the search box at the top of the page, paste a unique instance of either:
  - A content URI that points to XML describing the content (e.g. ://www.mtv.com/sitewide/data/xml/access/entity.jhtml?uri=mgid:uma:video:mtv.com:215746)
  - A content permalink (e.g. http://www.comedycentral.com/videos/index.jhtml?videoId=177572&title=starting-out-tommy-davidson)
- Click on the "Import From URL" button at the top of the page.
Notes:
This assumes you've already configured the content discovery feed to work with your flux community. If the feed is not set up to work within your community, you will likely end up discovering an XML document. See Setting up Content Discovery for more details on how to do that.
If the community has already discovered your content, the import will appear to do nothing. This means the content is already discovered within your community.
