You've arrived at Everything is Miscellaneous's blog page that was active 2008-2012. You'll find links to some useful information about the book and its subject matter, but don't be surprised by some dead links, etc.
To order a copy, go to your local bookstore, or Amazon, etc.
For information about me, David Weinberger, click here.
To visit the page underneath this text, click here.

Thanks - David Weinberger

Libraries of the future

We’ve just posted the latest Library Innovation Lab podcast, this one with Karen Coyle, who is a leading expert in Linked Open Data. Will we have perpetual but interoperable disagreements about how to classify and categorize works and decide what is the “same” work?

And, if you care about libraries and are in the Cambridge (MA) area on Oct. 4, there’s a kick-off event (http://osc.hul.harvard.edu/yopc/content/kick-event-harvard-library-strategic-conversations) at Sanders Theater at Harvard for a year of conversations about the future of libraries. Sounds great, although I unfortunately will be out of town :(

The National Archives is going all tag-arrific on us:

The Online Public Access prototype (OPA) just got an exciting new feature — tagging! As you search the catalog, we now invite you to tag any archival description, as well as person and organization name records, with the keywords or labels that are meaningful to you. Our hope is that crowdsourcing tags will enhance the content of our online catalog and help you find the information you seek more quickly.

Nice! (Hat tip to Infodocket.)

Linked Open Data take-aways

I just wrote up an informal trip report in the form of “take-aways” from the LOD-LAM conference I attended a couple of weeks ago. Here is a lightly edited version.

 


Because it was an unconference, it was too participatory to enable us to take systematic notes. I did, however, interview a number of attendees, and have posted the videos on the Library Innovation Lab blog site. I actually have a few more yet to post. In addition, during the course of one of the sessions (on “Explaining LOD-LAM”), a few of us began constructing a FAQ.

Here’s some of what I took away from the conference.

– There is considerable momentum around linked open data, starting with the sciences where there is particular research value in compiling huge data sets. Many libraries are joining in.

– LOD for libraries will enable a very fluid aggregation of information from multiple types of sources around any particular object. E.g., a page about a Hogarth illustration (or about Hogarth, or about 18th century London, etc.) could quite easily aggregate information from any data set that knows something about that illustration or about topics linked to that illustration. This information could be used to build a page or to do research.

– Making data and metadata available as LOD enables maximal re-use by others.

– Doing so requires expertise, but should be less massively difficult than supporting many other standards.

– For the foreseeable future, this will be something libraries do in addition to supporting more traditional data standards; it will be an additional expense and effort.

– Although there is continuing debate about exactly which license to use when publishing library data sets, it seems that putting any form of license on the data other than a public domain waiver is usually likely to be (a) futile and (b) so difficult to deal with that it will inhibit re-use of the data, depriving it of value. (See the 4-star license proposal that came out of this conference.)

– The key point of resistance against LOD among libraries, archives and museums is the justified fear that once the data is released into the world, the curating institutions can no longer ensure that the metadata about an object is correct; the users of LOD might pick up a false attribution, inaccurate description, etc. This is a genuine risk, since LOD permits irresponsible use of data. The risk can be mitigated but not removed.
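To make the Hogarth aggregation point above a bit more concrete, here is a minimal sketch in Python. It assumes that three institutions each publish statements about shared identifiers. The URIs, property names and data sets are all hypothetical, invented for illustration; real LOD would use agreed vocabularies and a proper RDF toolkit.

```python
from collections import defaultdict

# Hypothetical identifiers and data; none of this comes from a real data set.
HOGARTH_PRINT = "http://example.org/object/hogarth-gin-lane"
HOGARTH = "http://example.org/person/hogarth"
LONDON = "http://example.org/place/london"

# Each data set is just a list of (subject, property, value) statements.
museum_data = [
    (HOGARTH_PRINT, "title", "Gin Lane"),
    (HOGARTH_PRINT, "creator", HOGARTH),
]
library_data = [
    (HOGARTH_PRINT, "depicts", LONDON),
    (HOGARTH, "name", "William Hogarth"),
]
archive_data = [
    (LONDON, "label", "London"),
]


def aggregate(subject, *datasets):
    """Collect every statement about `subject` from all the data sets."""
    facts = defaultdict(list)
    for dataset in datasets:
        for s, p, o in dataset:
            if s == subject:
                facts[p].append(o)
    return dict(facts)


def aggregate_with_links(subject, *datasets):
    """Gather facts about the subject and, one hop out, about what it links to."""
    page = {subject: aggregate(subject, *datasets)}
    for values in page[subject].values():
        for value in values:
            linked = aggregate(value, *datasets)
            if linked:  # plain strings such as titles have no statements of their own
                page[value] = linked
    return page


page = aggregate_with_links(HOGARTH_PRINT, museum_data, library_data, archive_data)
for subject, facts in page.items():
    print(subject, facts)
```

The only thing the three sources have to agree on is the identifier. That is also why the risk in the last point is real: nothing in the merge checks whether any one source's statement is correct.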

Schema.org

Bing, Google and Yahoo have announced schema.org, where you can find markup to embed in your HTML that will help those search engines figure out whether you’re talking about a movie, a person, a recipe, etc. The markup seems quite simple. But, more important, by using it your page is more likely to be returned when someone is looking for what your page talks about.
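For a sense of how simple the markup is, here is a rough sketch in Python that prints an HTML fragment for a movie using the microdata attributes (itemscope, itemtype, itemprop) that schema.org documents. The function name and the choice of fields are my own assumptions for illustration, not anything the search engines prescribe.

```python
from html import escape


def movie_microdata(name, director, genre):
    """Render an HTML fragment that describes a movie with schema.org microdata.

    Hypothetical helper: the three fields are an arbitrary selection of
    properties from the schema.org Movie type.
    """
    return (
        '<div itemscope itemtype="http://schema.org/Movie">\n'
        f'  <h1 itemprop="name">{escape(name)}</h1>\n'
        f'  <span>Director: <span itemprop="director">{escape(director)}</span></span>\n'
        f'  <span itemprop="genre">{escape(genre)}</span>\n'
        '</div>'
    )


print(movie_microdata("Avatar", "James Cameron", "Science fiction"))
```

The visible text of the page stays the same; the extra attributes only tell a crawler which bit of text is the name, which is the director, and so on.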

Having the Big Three search engines dictating the metadata form is likely to be a successful move. SEO is a powerful motivator.

At the Linked Open Data in Libraries, Archives and Museums conf [LODLAM], Jonathan Rees casually offered what I thought was a useful distinction. (Also note that I am certainly getting this a little wrong, and could possibly be getting it entirely wrong.)

Background: RDF is the basic format of data in the Semantic Web and LOD; it consists of statements of the form “A is in some relation to B.”

My paraphrase: Before LOD, we were trying to build knowledge representations of the various realms of the world. Therefore, it was important that the RDF triples expressed were true statements about the world. In LOD, triples are taken as a way of expressing data; take your internal data, make it accessible as RDF, and let it go into the wild…or, more exactly, into the commons. You’re not trying to represent the world; you’re just trying to represent your data so that it can be reused. It’s a subtle but big difference.
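Here is a minimal sketch of that “take your internal data, make it accessible as RDF” step, in plain Python. The record layout, the example.org identifiers and the helper names are hypothetical. The two property URIs are the Dublin Core terms for title and creator, and the output lines follow the N-Triples shape of subject, predicate, object, period. A real project would more likely use an RDF library than hand-rolled string formatting.

```python
from typing import NamedTuple

BASE = "http://example.org/"  # hypothetical namespace for our own records
DC_TITLE = "http://purl.org/dc/terms/title"
DC_CREATOR = "http://purl.org/dc/terms/creator"


class Literal(NamedTuple):
    """A plain text value, as opposed to a URI that names a thing."""
    value: str


# An internal record, in whatever shape the local system happens to use.
record = {
    "id": "book/123",
    "title": "Everything Is Miscellaneous",
    "author_id": "person/weinberger",
}


def record_to_triples(rec):
    """Express the record as statements of the form 'A is in some relation to B'."""
    subject = BASE + rec["id"]
    return [
        (subject, DC_TITLE, Literal(rec["title"])),
        (subject, DC_CREATOR, BASE + rec["author_id"]),
    ]


def format_term(term):
    """Write a URI in angle brackets and a literal in escaped double quotes."""
    if isinstance(term, Literal):
        escaped = (
            term.value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        return f'"{escaped}"'
    return f"<{term}>"


def to_ntriples(triples):
    """Serialize triples one per line, each ending with a period."""
    return [f"{format_term(s)} {format_term(p)} {format_term(o)} ." for s, p, o in triples]


for line in to_ntriples(record_to_triples(record)):
    print(line)
```

Note that nothing here claims to model the world. It just restates what the record already says, in a form that others can pick up and link to.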

I also like John Wilbanks‘ provocative tweet-length explanation of LOD: “Linked open data is duct tape that some people mistake for infrastructure. Duct tape is awesome.”

Finally, it’s pretty awesome to be at a techie conference where about half the participants are women.

Discovery, the metadata ecology for UK education and research, invites stakeholders to join us in adopting a set of principles to enhance the impact of our knowledge resources for the furtherance of scholarship and innovation…

What follows is a set of principles that are hard to disagree with.

The GiveALink link-sharing site has posted two games that are actually research studies.

The first game is GiveALink Slider, which the site says “is an interesting online tagging game in which you must annotate webpages with related tags and choose new webpages. You can accumulate points and win badges by accomplishing tasks and building links with other players.” They are giving iPods to the winners. It’s actually a study called “Social Annotations through Game Play” conducted by the Networks and Agents Network in the Center for Complex Networks and Systems Research of the Indiana University School of Informatics.
Here’s the description of the second game:

Great Minds Think Alike is a word association game that lets users build semantic concept networks and explore similarity relations.

Players form a chain of semantically related words, which comes from the GiveALink knowledge base. Users can browse through nine different social media, e.g. Flickr and Youtube, and earn points.

Words are geo-tagged, which helps to analyze the geographical distribution of terms. Players can also connect with other players via Facebook as suggested by the game.

Data from the game is collected by GiveALink.org to make the game more fun, support other social tagging applications, and for study purposes.

No, I don’t actually understand how either game works, and I haven’t signed up for them because the first one is a study that I don’t want to commit to and the second requires an iPhone. But, the GiveALink service is interesting. It’s an open bookmark-sharing service that also feeds a research program. [Hat tip to Julianne Chatelain.]

As PR for an upcoming appearance by James Gleick, whose new book The Information I am greatly looking forward to reading, Zocalo Public Square asked four or five folks “Can there be too much information?” It’s an interesting collection of responses. (Well, mine excepted.)

And underneath these interesting-in-themselves essays runs a different question when they are taken together: What the heck do we mean by “information” anyway? I’m not sure any of the respondents is defining it in the same way. The ways include: opinions, raw data, words, ideas, photos, switches and dials, and books. Of course, some of these are containers of information or examples of information. But they do not reduce to a single definition. (I believe Gleick’s book is at least in part about this ambiguity about information. It’s also something I’ve been researching for the past couple of years.)

As far as my contribution goes, I had to decide whether to provide an Everything Is Miscellaneous answer (we are learning to organize info in new ways) or a Too Big to Know answer (the quantity of info is changing the nature of knowledge). I went with the new book rather than the old, if only because I wrote the tiny essay within minutes after finishing revising the book manuscript.

[2b2k] Tagging big data

According to an article in Science Insider by Dennis Normile, a group formed at a symposium sponsored by the Board on Global Science and Technology, of the National Research Council, an arm of the U.S. National Academies [that’s all they’ve got??] is proposing making it easier to find big scientific data sets by using a standard tag, along with a standard way of conveying the basic info about the nature of the set, and its terms of use. “The group hopes to come up with a protocol within a year that researchers creating large data sets will voluntarily adopt. The group may also seek the endorsement of the Internet Engineering Task Force…”
