Chiron

(Provisional content, more to come.) Chiron is a long-lived, long-term project: it has been completely refactored during its history and is now approaching its third major release.

Chiron is a software system for the analysis of several different languages, poetical traditions, metres and texts, although its main applications to date have been in Greek and Latin metrics. Thanks to its modular architecture, specialized components can be plugged into several extension points, at different stages of the analysis itself. The analysis is a two-step process, in which data collection is completely separated from data interpretation. The first step is carried out by chaining any number of components (analysis layers), each performing a specialized task, from phonology to syntax up to metrical scansion; users are free to build their own chain from built-in and/or third-party components. Each layer adds new data (traits) to a sequence of segments representing the phonological analysis of the input text, and there is no limit to the traits that can be defined and added. Layers are unaware of each other, but they often require data collected by lower layers to do their job.
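To make this data-collection step more concrete, here is a minimal sketch in Python of the segment/trait/layer design described above; all names (Segment, Layer, SyllableWeightLayer, run_chain) are hypothetical and only illustrate the architecture, not Chiron's actual API.

```python
# A minimal sketch of the layered data-collection step described above.
# All names are hypothetical and only illustrate the architecture.
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Segment:
    """One unit of the phonological analysis of the input text."""
    text: str
    traits: dict[str, object] = field(default_factory=dict)  # open-ended set of traits


class Layer(Protocol):
    """An analysis layer: reads the segments, adds its own traits."""
    def apply(self, segments: list[Segment]) -> None: ...


class SyllableWeightLayer:
    """Example layer: derives syllable weight from traits set by lower layers."""
    def apply(self, segments: list[Segment]) -> None:
        for seg in segments:
            # relies on data collected by a lower (phonological) layer
            if seg.traits.get("vowel-length") == "long" or seg.traits.get("coda"):
                seg.traits["weight"] = "heavy"
            else:
                seg.traits["weight"] = "light"


def run_chain(segments: list[Segment], layers: list[Layer]) -> list[Segment]:
    """Run the user-defined chain: each layer only adds traits, never removes them."""
    for layer in layers:
        layer.apply(segments)
    return segments
```

Because every layer only adds traits to a shared sequence of segments, new layers can be plugged into the chain without touching the existing ones.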

In Greek and Latin metrics, a syntax layer is required between the phonological and the metrical one: its main job is detecting appositives and clitics, whose correct treatment has a huge impact on all the collected data. Even though the theory behind these concepts is rather complex, this difficulty is no excuse for resorting to the simplistic and totally misleading solution of equating word ends with printed spaces: in my papers I have tried to illustrate this point with both aggregated and detailed examples.
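As a purely illustrative example (the trait names here are invented for this sketch, not taken from Chiron), a trait set by such a syntax layer can mark a proclitic so that the printed space following it is not counted as a word end:

```python
# Hypothetical illustration: a proclitic (e.g. a preposition) is printed as a
# separate word, yet forms a single appositive group with what follows, so the
# space after it is not a metrical word end. Trait names are invented here.
segments = [
    {"text": "in",     "traits": {"appositive": "proclitic"}},
    {"text": "silvis", "traits": {}},
]

word_end_after = [seg["traits"].get("appositive") != "proclitic" for seg in segments]
print(word_end_after)  # [False, True]: two printed words, but only one metrical word end
```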

Finally, once the metrical layer too has completed, the analysis data are saved into a repository to be used by the data-interpretation components. These are composed into a set of observers, each specialized in observing one and only one phenomenon; users can build their own set of observers by picking them from built-in components or third-party plugins. All their observations are stored in a metrics corpus, a true database which can later be queried with arbitrarily complex expressions to retrieve the data variously combined for any type of research. This provides a sort of live metrics laboratory for testing hypotheses and refining the analysis methods themselves.
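The following is a minimal sketch, under assumed names, of how such observers and their corpus might be composed; none of this is Chiron's actual API.

```python
# Sketch of the data-interpretation step: each observer watches a single
# phenomenon and records observations, which are collected into a queryable
# corpus. All names are assumptions made for this illustration.
from dataclasses import dataclass


@dataclass
class Observation:
    phenomenon: str      # e.g. "elision", "caesura", "hiatus"
    location: str        # e.g. a line reference
    data: dict


class Observer:
    """Base class: one observer, one phenomenon."""
    phenomenon = ""

    def observe(self, segments) -> list[Observation]:
        raise NotImplementedError


class ElisionObserver(Observer):
    phenomenon = "elision"

    def observe(self, segments) -> list[Observation]:
        return [
            Observation(self.phenomenon, seg.traits["location"], {"text": seg.text})
            for seg in segments
            if seg.traits.get("elided")
        ]


def build_corpus(segments, observers: list[Observer]) -> list[Observation]:
    """Run the user-picked set of observers and collect everything into one corpus."""
    corpus: list[Observation] = []
    for obs in observers:
        corpus.extend(obs.observe(segments))
    return corpus


# The corpus can then be filtered with arbitrarily combined conditions, e.g.:
# [o for o in corpus if o.phenomenon == "elision" and o.data.get("position") == 5]
```

Keeping each observer focused on a single phenomenon means the set of observations grows simply by adding observers, while the corpus remains a single place where all of them can be queried together.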

This is essential especially where scholars may be interested in a set of data much larger than the scope of a specific investigation, precisely because metrics builds patterns from language features, which in turn are highly interdependent and change over time. Thus, the ability to examine the whole dataset from any perspective and combine the data freely is of paramount importance for their interpretation.

You can find a small live demo of syllabification here: just type a word in any of the available languages and click the button (or press Enter) to have it fully analyzed into its "segments" and "traits" and to see the corresponding syllabic division, based on a model essentially (but not only) relying on phonetic openings (have a look at the chart to grasp the concept at a glance).
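As a rough, purely illustrative approximation of a model relying on phonetic openings, the toy function below places syllable boundaries at local minima of a sonority scale; both the scale and the examples are simplified assumptions, not the demo's actual model.

```python
# Toy syllabification driven by phonetic opening (sonority): boundaries are
# placed at local minima of the sonority curve. Scale values are illustrative.
SONORITY = {  # rough opening scale: higher = more open
    "p": 1, "t": 1, "k": 1, "b": 2, "d": 2, "g": 2,
    "f": 3, "s": 3, "m": 4, "n": 4, "l": 5, "r": 6,
    "i": 8, "u": 8, "e": 9, "o": 9, "a": 10,
}

def syllabify(word: str) -> list[str]:
    """Split a (lower-case, phonetically spelled) word at sonority minima."""
    scores = [SONORITY.get(c, 0) for c in word]
    boundaries = [
        i for i in range(1, len(word) - 1)
        if scores[i] < scores[i - 1] and scores[i] <= scores[i + 1]
    ]
    syllables, start = [], 0
    for b in boundaries:
        syllables.append(word[start:b])
        start = b
    syllables.append(word[start:])
    return syllables

print(syllabify("domino"))  # ['do', 'mi', 'no']
print(syllabify("templa"))  # ['tem', 'pla']: the boundary falls before the sonority minimum 'p'
```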