This page describes the iVia Media Type assignment algorithm.
iVia describes the format of each Internet resource with a single Media Type (also known as a MIME type) value, such as text/html or application/pdf. Media Type values are assigned by IANA, which publishes an authoritative list.
If an HTTP header is available for a resource, iVia attempts to assign a Media Type based on the Content-Type field. In the absence of a header, the Media Type is assigned by using the libmagic library (which is the basis of the common Unix file command) to analyze the document. libmagic determines the Media Type of a file using a database of rules that map distinctive low-level features of many file types to their Media Types. For example, the rule below states that if position 0 in a file contains the string %PDF-, then the Media Type of the file is application/pdf.
0 string %PDF- application/pdf
On the author's Linux workstation, there are 308 rules, which identify 183 different file types.
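The two-step assignment described above can be sketched in Python as follows. This is an illustration only, not iVia's actual code: the function names and the small RULES table are hypothetical, and the table imitates libmagic's offset-and-string rules rather than calling libmagic itself. Only the %PDF- rule comes from this page; the PNG and GIF entries are added for illustration. The sketch also assumes that a header lacking a usable Content-Type field falls through to content analysis.

```python
from typing import Mapping, Optional

# Hypothetical rule table in the spirit of libmagic: (offset, bytes, Media Type).
# Only the %PDF- rule is taken from this page; the others are illustrative.
RULES = [
    (0, b"%PDF-", "application/pdf"),
    (0, b"\x89PNG\r\n\x1a\n", "image/png"),
    (0, b"GIF8", "image/gif"),
]


def media_type_from_header(headers: Mapping[str, str]) -> Optional[str]:
    """Return the Media Type from the Content-Type field, if there is one."""
    for name, value in headers.items():
        if name.lower() == "content-type":
            # Drop parameters such as "; charset=utf-8".
            media_type = value.split(";", 1)[0].strip().lower()
            return media_type or None
    return None


def media_type_from_content(data: bytes) -> Optional[str]:
    """Return the Media Type of the first rule whose bytes match at its offset."""
    for offset, magic, media_type in RULES:
        if data[offset:offset + len(magic)] == magic:
            return media_type
    return None


def assign_media_type(headers: Optional[Mapping[str, str]],
                      data: bytes) -> Optional[str]:
    """Prefer the HTTP header; otherwise analyze the document itself."""
    if headers is not None:
        from_header = media_type_from_header(headers)
        if from_header:
            return from_header
    return media_type_from_content(data)
```

For example, assign_media_type(None, b"%PDF-1.4") matches the %PDF- rule at position 0, while assign_media_type({"Content-Type": "text/html; charset=utf-8"}, b"") takes the value from the header without inspecting the content.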
We have not evaluated Media Type assignment because we do not have a suitable test set.