Exposome Perspectives Blog

Meta-data, meta-fiction, meta-music, meta-everything


Exposome Perspectives Blog by Robert O. Wright, MD, MPH

I am old. I make that point now so that you might excuse some of this post’s historical references (and perhaps explain some of its transient grumpiness). I learned about blogging late in life, because there were no blogs when I was young. There were newspapers, magazines, and free weekly periodicals that came in the mail and were mostly used to wrap fish or as kindling to start a fire. A search engine was your arm moving aside the different periodicals in the newsstand rack. Beyond this stroll through the 1970s, you may ask “why is this old man writing a blog?” Well, I believe writing this blog allows me to say something useful about the exposome in a novel format, and I have been working in its precursors, environmental health and toxicology, for 30 years. I have witnessed many changes over time and have a fair amount of historical memory. Writing a blog is also a way for me to bring together my interests in literature and music and meld them with science in transdisciplinary metaphors.

My love of writing started in childhood. My mother didn’t learn English till she was an adult, and perhaps because she struggled to write in English, she emphasized the importance of writing to me throughout my childhood. I ended up as an English Literature major at a commuter college. Over time, I realized that my mother saw writing as utilitarian, and not as a career goal. She feared English majors would never get a job and pressured me to switch to science, which I dutifully did. But you can’t shake who you are, and even now I often tell my kids that I write for a living (i.e., grants and papers), so I did figure out a way to bring writing into my work life. I came to blogging because I could do it in my off hours, and it lets me play with themes that grants and papers can’t easily consider, such as connections between art, history, and science that are too often hidden.

Given my overly long introduction, you may ask “what is today’s post about?” Metadata, i.e., data about data: why it exists, why it is important, and how it is used. How does this relate to literature? Well, there is a genre of literature called “metafiction,” which is fiction about fiction. You’ve likely read a metafiction story or seen a metafiction movie. One of the better examples is “Harold and the Purple Crayon”. When Harold encounters an obstacle or need, he writes a story to make it so. He draws sidewalks, dragons, and boats, even pies to eat. But he isn’t the only author; there is another voice telling us the story of Harold. So Harold both writes the story and is the story. The movie “The Princess Bride” is another example of metafiction. Peter Falk plays a grandfather reading a story to his sick grandson about Westley, Buttercup, Inigo, and Fezzik the giant. The “meta” aspect of these stories changes the dynamic and reorients the reader to what is really happening. Using a meta literary trick, each story presents a different, more nuanced meaning. Harold draws the moon and it begins to exist. At the simplest level his world is magical. If the story were told directly from Harold’s viewpoint, he might seem to have impossible powers that no child will ever have. But by telling the story in the third person, the story becomes about Harold and how he is writing his own life; it is more about advice on how to live than about magic. Likewise, “The Princess Bride” has many layers of meaning, but perhaps the most relatable is the universality and diversity of love. The movie presents many different embodiments of love: Westley and Buttercup, Inigo and his father, Inigo and Fezzik’s friendship, a grandparent and his grandchild, etc.

Music also has meta perspectives. I previously wrote about the Magnetic Fields’ song “The Book of Love,” which is a metasong that illustrates the many faces of love as we age and experience life, but does so by singing about a book and not about love or life. In the first verse, the singer praises someone, perhaps a parent, who reads them the Book of Love, ending with the line, “You can read me anything.” In the second verse, the song tells us the Book of Love contains music and instructions for dancing, and the final line is “You can sing me anything.” The final verse is about gifts, and the singer progresses to “You can give me anything” and then adds “You ought to give me wedding rings,” and you realize the listener has experienced the evolution of love across life and the “book of love” is actually a metaphor for life. Stephin Merritt (who writes most of their songs) also wrote the very meta “Two Characters in Search of a Country Song,” which references country music classics.

“You were just like me, you were one big bruise
In the game of life we were playing to lose
You were Jesse James, I was William Tell
You were Daniel Webster, I was the Devil himself

Two characters in search of a country song
Just make-believe, but so in love
Two characters been listening all night long
For voices from Nashville above”

If you are wondering why Nashville is above them, reread the first verse. There are other “meta-music” examples too (e.g., “Glass Onion,” by the Beatles, is a song about other Beatles songs, giving you clues about their meaning, and Carly Simon’s lament that your vanity made you think her song was about “you” is also very meta). As I said, I am old, and so are my song references, but in every case of “meta” fiction or music, the meta orientation changes the meaning or provides additional context to the reader or listener.

By the way, this is a blog post about writing blogs. See what I did? This is a metablog post! My long introduction may seem more intentional and less meandering now that you know that.

On to metadata. It is commonly defined as data about data, but as with other “meta” examples, that definition is too basic and doesn’t tell the whole story. In fact, sometimes a variable serves as “data” to predict an outcome, and in other cases the same variable is used as “metadata”. Biological sex can be used as a predictor variable (i.e., as data). But if you use sex as a blocking variable when randomizing blood samples, to ensure equal numbers of men and women on each assay plate, you are using it as metadata.
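
For readers who think in code, here is a minimal Python sketch of that second use. It is only an illustration under stated assumptions, not a description of any particular lab's procedure: the function name `assign_plates`, the sample IDs, and the plate count are all hypothetical. Sex is never analyzed here; it only decides how samples are dealt onto plates.

```python
import random
from collections import defaultdict

def assign_plates(samples, n_plates, seed=0):
    """Randomly assign samples to assay plates, balancing sex across plates."""
    rng = random.Random(seed)  # fixed seed so the layout can be reproduced

    # Sex is used here as metadata: it defines the blocks, not the outcome.
    blocks = defaultdict(list)
    for sample in samples:
        blocks[sample["sex"]].append(sample)

    plates = [[] for _ in range(n_plates)]
    for sex in sorted(blocks):
        block = blocks[sex]
        rng.shuffle(block)  # random order within each block
        for i, sample in enumerate(block):
            plates[i % n_plates].append(sample)  # deal out like cards
    return plates

# Hypothetical input: 16 blood samples, 8 from women and 8 from men.
samples = [{"id": f"S{i:02d}", "sex": "F" if i % 2 == 0 else "M"} for i in range(16)]
plates = assign_plates(samples, n_plates=4, seed=1)
for number, plate in enumerate(plates, start=1):
    print(number, [(s["id"], s["sex"]) for s in plate])
```

With 8 women and 8 men dealt onto 4 plates, every plate receives 2 of each; which particular samples land on which plate depends on the seed.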

The first examples of using metadata as a search tool actually come from your local library. Even in ancient times, libraries recognized that as they grew larger and larger, finding a single book was beyond anyone’s memory, so search systems had to be created.

If you are old (again, like me), you may remember the Dewey Decimal System from your school library, which used metadata about books (title, author, subject) along with call numbers to guide you in locating them. This may be the most important use of metadata, especially as the volume and complexity of information produced by exposomics (air pollution, extreme weather, chemicals, wearable devices, stress, etc.) are increasing exponentially, making the navigation of this exciting information extraordinarily challenging. Alliteration aside, the system invented by Melvil Dewey, which standardized how different libraries organized books, is a roadmap for using exposomic metadata. Just as the Dewey Decimal System is used to locate information in a physical library, agreed-upon digital metadata standards could be used to capture and locate digitized exposomic information. We need to agree on which metadata variables (age, calendar year, place, sex, occupation, etc.) will serve as keywords for searching exposomic databases. These variables could tell us “the chemical data in this dataset come from New York, from women working in office settings, in the year 2023.” Such metadata would give users knowledge about dataset content in a structured, predictable way, ranging from descriptive information such as variable names, descriptors, and types, to administrative information such as usage restrictions, and technical information such as the assay method or questionnaire instrument used.
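
To make that concrete, here is a small Python sketch of what such a structured record, and a search over it, might look like. Everything in it is an assumption made for illustration: the `DatasetMetadata` fields, the `find_datasets` helper, and the three catalog entries are hypothetical and do not describe any existing exposomic standard or database.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DatasetMetadata:
    # Descriptive information
    name: str
    variables: tuple
    # Agreed-upon search keys
    place: str
    year: int
    sex: str
    occupation: str
    # Administrative information
    usage_restrictions: str
    # Technical information
    assay_method: str

def find_datasets(catalog, **criteria):
    """Return the records whose metadata match every key=value pair given.

    A misspelled key raises AttributeError, which is the point of a shared
    standard: everyone searches on the same, predictable field names.
    """
    return [record for record in catalog
            if all(getattr(record, key) == value for key, value in criteria.items())]

# Hypothetical catalog entries, invented for this example.
catalog = [
    DatasetMetadata("office_air_2023", ("pm25", "voc_total"), "New York", 2023,
                    "female", "office", "open", "illustrative assay A"),
    DatasetMetadata("factory_dust_2023", ("lead", "cadmium"), "Boston", 2023,
                    "male", "manufacturing", "restricted", "illustrative assay B"),
    DatasetMetadata("office_air_2019", ("pm25",), "New York", 2019,
                    "female", "office", "open", "illustrative assay A"),
]

hits = find_datasets(catalog, place="New York", sex="female",
                     occupation="office", year=2023)
print([record.name for record in hits])
```

The search never opens the chemical measurements themselves; it reads only the labels on the outside of each dataset, much as a card catalog lets you find a book without opening it.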

Perhaps the primary value of metadata is in helping researchers find relevant information and discover critical existing resources. This practice is now commonplace and helps our research to be discovered by others. Authors of journal articles provide “keywords,” which are used as “meta tags” by search engines such as PubMed or Google Scholar. Metadata can be used to organize digital resources, provide unique identifiers, and support the preservation and archiving of information. Before Dewey, libraries used their own methods to organize information, so if you went to the next town’s library, you couldn’t find the books you wanted because the search system was different. In studying the exposome, there are and will be thousands of related databases with enormous amounts of useful but disconnected data. Searching for these data is extraordinarily difficult. As exposomics grows, we won’t be able to find the information we need unless the scientific community standardizes how we organize the data, particularly if we are to share data and information, such as non-targeted chemical libraries. So while ontologies are important for this standardization, so are the metadata and search engines, which must be compatible with the systems of multiple data generators. Just as metafiction tools reorient readers to more specific meanings of stories, metadata orients data users to different meanings depending on the goal of their analysis, which is, after all, the story their data tell. Metadata tells us the history of the data, the methods by which they were collected, what quality assurance/quality control procedures were applied, and many, many other useful facts. Instead of sorting through all the chemical data in a dataset, with metadata you can refine your search to particular settings, methods of collection, age groups, and calendar years.
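
The “next town’s library” problem can be sketched in a few lines of Python. This is an illustration only, and every name in it is hypothetical: two imaginary data generators, `lab_a` and `lab_b`, label the same facts differently, and a simple crosswalk renames their fields to shared ones so that a single search works across both. The assay labels are just placeholder values.

```python
# Hypothetical crosswalks: each generator's local field names -> shared names.
CROSSWALKS = {
    "lab_a": {"city": "place", "yr": "year", "method": "assay_method"},
    "lab_b": {"location": "place", "collection_year": "year", "assay": "assay_method"},
}

def harmonize(record, source):
    """Rename a record's local fields to the shared metadata names."""
    mapping = CROSSWALKS[source]
    # Fields without a crosswalk entry keep their original name.
    return {mapping.get(field, field): value for field, value in record.items()}

# The same kind of information, described two different ways.
lab_a_record = {"city": "New York", "yr": 2023, "method": "LC-MS"}
lab_b_record = {"location": "New York", "collection_year": 2021, "assay": "GC-MS"}

shared = [harmonize(lab_a_record, "lab_a"), harmonize(lab_b_record, "lab_b")]

# One query now works across both sources.
recent = [r for r in shared if r["place"] == "New York" and r["year"] >= 2022]
print(recent)
```

A crosswalk like this is a patch applied after the fact. The argument above is that agreeing on the shared names up front would make the patch unnecessary.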

All of these are examples of the key role that metadata will play in exposomics. These systems are coming, perhaps too slowly, but eventually the need to build them will overcome the inertia. The future is already here in many ways, as our digitized society already offers examples of the value of metadata. Businesses now use digital metadata standards to organize supply chains from producers to consumers, an infrastructure that grew out of need. Likewise, libraries already organize books digitally, and the Dewey card system is largely gone. That means that books of love, like “The Princess Bride,” can be found at your fingertips with a simple keystroke. So in the end, perhaps we should all say, “Thank you, Metadata, for all you do.”