WebAnnotator Manual

What is WebAnnotator?

WebAnnotator is a tool for annotating Web pages. It is implemented as a Firefox extension and can annotate both offline and online pages. The HTML rendering is fully preserved, and all annotations consist of new HTML spans with specific styles. WebAnnotator provides an easy, general-purpose framework and is made available under the CeCILL free license (close to the GNU GPL; see the license text), so that use and further contributions are simple.

All parts of an HTML document can be annotated: text, images, videos, tables, menus, etc. The annotations are created by simply selecting a part of the document and clicking on the relevant type and subtypes. The annotated elements are then highlighted in a specific color. Annotation schemas can be defined by the user by creating a simple DTD representing the types and subtypes that must be highlighted. Finally, annotations can be saved (HTML with highlighted parts of documents) or exported (in a machine-readable format).

WebAnnotator will be presented at the LREC conference. If you use it, please cite this reference.

Installing WebAnnotator

Getting started

You can try out the add-on by importing one of the sample files, ne.dtd or chunking.dtd, and annotating text as shown in the section "Annotating pages". These two sample files are also provided in the samples directory of the extension directory.

The first time you begin an annotation session, you might not see the bottom panel. This is because its height is zero. Just drag the horizontal line at the bottom to resize the panel.

Creating a new annotation schema

An annotation schema is specified by importing a DTD into the extension.

For example, the DTDs corresponding to the two sample annotation schemas provided with the extension are:

Chunking:

<!-- Three high-level annotation types: NP, VP, PP -->
<!ELEMENT NP (#PCDATA)>
<!ELEMENT VP (#PCDATA)>
<!ELEMENT PP (#PCDATA)>

Named Entities:

<!-- Four high-level annotation types: person, org, location, date -->
<!ELEMENT person (#PCDATA)>
<!ELEMENT org (#PCDATA)>
<!ELEMENT location (#PCDATA)>
  <!-- Attributes for locations, no default -->
  <!ATTLIST location type (river|mountain|city|country) #IMPLIED>

<!ELEMENT date (#PCDATA)>
  <!-- Attributes for dates, default for rel is "absolute" -->
  <!ATTLIST date type (date|time|duration) #REQUIRED
              rel (absolute|relative) "absolute">
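
To make the mapping from a DTD to annotation types and subtypes more concrete, here is a minimal Python sketch. It is not WebAnnotator's own parser: the function name read_schema and the regular expressions are assumptions made for illustration, and they only handle simple DTDs like the samples above (ELEMENT declarations plus ATTLIST declarations with enumerated values). Each ELEMENT is read as an annotation type, and each enumerated attribute as a set of possible subtypes.

```python
import re

# Hypothetical helper, not part of WebAnnotator: it only illustrates how a
# simple DTD can be read as a list of types (ELEMENT declarations) and
# subtypes (enumerated attributes in ATTLIST declarations).
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
ELEMENT_RE = re.compile(r"<!ELEMENT\s+([\w.-]+)\s")
ATTLIST_RE = re.compile(r"<!ATTLIST\s+([\w.-]+)\s+(.*?)>", re.DOTALL)
ENUM_ATTR_RE = re.compile(r"([\w.-]+)\s+\(([^)]*)\)")


def read_schema(dtd_text):
    """Return {type: {attribute: [allowed values]}} for a simple DTD."""
    # Drop comments first so that their content is never parsed.
    text = COMMENT_RE.sub("", dtd_text)
    # Every declared element becomes an annotation type.
    schema = {name: {} for name in ELEMENT_RE.findall(text)}
    # Every enumerated attribute becomes a group of subtypes for its element.
    for element, body in ATTLIST_RE.findall(text):
        attributes = schema.setdefault(element, {})
        for attr, values in ENUM_ATTR_RE.findall(body):
            attributes[attr] = [v.strip() for v in values.split("|")]
    return schema


# The Named Entities sample schema, with the default value quoted.
NE_DTD = """
<!ELEMENT person (#PCDATA)>
<!ELEMENT org (#PCDATA)>
<!ELEMENT location (#PCDATA)>
<!ATTLIST location type (river|mountain|city|country) #IMPLIED>
<!ELEMENT date (#PCDATA)>
<!ATTLIST date type (date|time|duration) #REQUIRED
               rel (absolute|relative) "absolute">
"""

for type_name, attributes in read_schema(NE_DTD).items():
    print(type_name, attributes)
```

The sketch ignores #IMPLIED, #REQUIRED and default values; it is only meant to show which parts of the DTD correspond to the types and subtypes offered during annotation.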

After choosing the DTD file, you can specify options such as the name and description of the schema, as well as the colors of your annotations. All these options can be changed later by clicking on the Options button in the WebAnnotator panel (at the bottom of the page when the extension is activated).

Annotating pages

To annotate a specific Web page, select Choose Schema in the WebAnnotator button menu. Only one document can be annotated at a time, which is why the button is deactivated in other Firefox tabs once a session has begun.

WebAnnotator panel

At the beginning of the session, a panel opens at the bottom of the page. This panel lists the annotations as you create them and also contains a bar for various settings.

Note that all links in the HTML document are deactivated by default, in order to allow easy selection of text. You can reactivate links by clicking on Activate Links at the bottom right of the page.

Creating new annotations

Once the annotation schema has been chosen, creating new annotations is very simple. Select the elements (text, images, etc.) that you want to annotate, and choose their type in the frame that opens.

The new annotation will then appear in the panel table.

Editing and removing annotations

You can edit or delete a specific annotation in two different ways. The first method is to find your annotation in the panel and click on one of the two icons on the left. Selecting the annotation in the panel table will make the annotation blink in the HTML document.

The second method is to put your mouse over the annotation itself, in the HTML document. The element will start blinking and the same two icons will appear.

Saving & Exporting

Two output formats are available in WebAnnotator: the save format (HTML with the annotated parts of the document highlighted) and the export format (machine-readable).

If you want to keep local copies of all linked URIs in the document (images, CSS, scripts, etc.), check the Save linked URIs box in the save dialog box.

Annotating with multiple schemas

You can annotate a Web page with a given schema, save the result, and then annotate it with another schema. You can switch between schemas as many times as you want. Choosing to "keep color information" when saving your annotations will result in all annotations being visible at the same time. If you uncheck this box instead, you will see only the annotations of the currently activated schema.

Known issues

Do not hesitate to contact me if you find any annoying bug in this add-on, or if you wish to take part in its development.
  1. During an annotation session, closing the window will lead to two exit messages. If you give a different answer to each of them, the extension can end up in a state where it is no longer usable. You then need to restart Firefox to be able to use it again.
  2. Links inside iframes are not deactivated.
  3. Annotation by triple-click selection does not highlight the correct text span. Over-selection problems when annotating by double-click have been reported but not reproduced.

Citing WebAnnotator

If you use WebAnnotator as an annotation tool for research, please cite:

WebAnnotator, an Annotation Tool for Web Pages. Xavier Tannier. Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012, to appear). Istanbul, Turkey, 2012.