CFP: Composition as Big Data

Computational analysis of big data has changed the way information is processed. Corporations analyze patterns in what people buy, how far they run, where they spend their time; they quantify habits to create more effective advertisements and cross-promotions. In academe, humanities scholars are using computational analysis to identify patterns in literary texts, historical documents, image archives, and sound, all of which has added to the body of knowledge in humanities theory and methodology. Meanwhile, many institutions and writing programs are adopting learning management systems that may digitally archive hundreds – if not thousands or tens of thousands – of student compositions from across levels and disciplines. What is our responsibility, and what is the potential, in harnessing big-data methods as composition researchers, teachers, and administrators?

Composition and rhetoric scholars have begun to adopt corpus-based computational analysis both to better understand the field as a whole – through the rhetoric of job postings (Lauer), professional journals (Mueller; Almjeld et al.), and dissertation records (Miller; Gatta) – and to research student compositions, the teaching of which is the primary job of most composition and rhetoric scholars. Through data-driven studies of student entrance exams (Aull), citation practices (Jamieson and Moore Howard), revision practices (Moxley), and acknowledgment of counterarguments (Lancaster), scholars have found patterns that distinguish student writing from published academic writing, suggesting areas to target for instruction.

This edited collection will model and reflect on the research made possible by high-capacity data storage and computation, either alone or in conjunction with close reading and evaluation in context. Authors are invited to submit abstracts for chapters that focus on the rhetoric, methods, and findings of recent large-scale data studies of writing. We are especially interested in contributions that include replicable practices and/or detailed descriptions of method, with an eye toward graduate-level research, teaching, or administrative applications in the intersecting fields of digital humanities, linguistics, and composition.

The following list of topics and questions is not exhaustive, but suggestive, illustrating the range of issues to be taken up:

  • Data Capture and the Captivation of Data
    • When we say “big data” in composition what do we mean? What datasets are available, promising, or already producing insight?
    • What new questions do these datasets allow us to ask or answer? What are their limitations?
    • How has data gathered from large corpora of (student) writing changed the scholarship and practice of composition / rhetoric? How might such data do so in the future?
  • Responsible Research
    • Who is responsible for creating or curating datasets in composition? How might the answers change at different scales?
    • What are the ethical responsibilities of anyone storing, retrieving, or analyzing composition data – perhaps especially where students and their writing are concerned?
    • How should researchers negotiate issues of consent and representation when recording or reporting on data? How is this affected by the scale or scope of the data?
  • Discourse and Discovery
    • How can computational tools aid in the qualitative coding of (student) writing? How do these practices relate to traditional coding methods?
    • What data-supported models of writing practices emerge from the study of digital corpora?
    • What does or can big data show about the nature of expertise and learning in the context of composing?
  • Pedagogical Practices
    • How can the field of composition / rhetoric use data to positively impact pedagogical or andragogical practices? For example, how can data-supported studies improve composition instruction in higher education?
    • What is the relationship between distant and close reading in regard to assessing student writing? Can and/or should distant reading practices be applied to assessment at the undergraduate level, and in what ways?
    • What role can analysis of big data play for student researchers in composition / rhetoric?
  • Supporting a Data-Supported Future
    • What standards or best practices are emerging for data archiving, aggregation, and interoperability?
    • How might those new to big-data approaches most usefully manage issues of scope or documentation?
    • How can we best support new researchers, teachers, or administrators in developing comfort with big-data approaches and insights? What does a successful program of big-data training look like?

Abstracts of approximately 350 words should provide, in as much detail as possible, the focus and argument(s) for the proposed chapter. Abstracts and brief bios are due 1 August 2017 via Google Forms. Questions can be directed to Amanda Licastro or Ben Miller with the subject line “Composition as Big Data.”

The Editors

Amanda Licastro is an Assistant Professor of Digital Rhetoric at Stevenson University in Maryland. Amanda’s fields of research include digital humanities, composition and rhetoric, textual studies, and interactive technology and pedagogy. Recent publications include “The Problem of Multimodality: What Data-Driven Research Can Tell Us About Online Writing Practices” in Communication Design Quarterly, a co-authored chapter on “Collaboration” in Digital Pedagogy in the Humanities: Concepts, Models, and Experiments, a webtext in the 20th anniversary edition of Kairos, “The Roots of an Academic Genealogy: Composing the Writing Studies Tree” with Ben Miller and Jill Belli, and her dissertation “Excavating ePortfolios: What Student-Driven Data Reveals about Multimodal Composition and Instruction,” which won the Calder Award for Digital Humanities. Amanda is on the Editorial Collective of The Journal of Interactive Technology and Pedagogy, and is the co-founder of The Writing Studies Tree. You can follow Amanda on Twitter @amandalicastro.


Benjamin Miller is an Assistant Professor of Composition at the University of Pittsburgh, focusing on digital research and pedagogy. He is the lead developer of the Writing Studies Tree, a crowdsourced, open-access database of academic genealogies in Composition/Rhetoric and related fields, tracing connections among scholars and institutions along lines of mentorship, education, collaboration, and employment; he has written about the WST, with Amanda Licastro and Jill Belli, in Kairos 20.2 (2016). He is also the author of “Mapping the Methods of Composition/Rhetoric Dissertations: A ‘Landscape Plotted and Pieced,’” an article drawing on big data and data visualization techniques, published in CCC in 2014. A founding editor of the open access Journal of Interactive Technology and Pedagogy, Ben continues to be an active member of its editorial collective. He received a CCCC Chairs’ Memorial Scholarship in 2012, and a CCCC Emergent Research/er Award in 2017 for Distant Readings of Disciplinarity: Knowing and Doing in Composition/Rhetoric Dissertations. He has taught writing at Pitt, at Hunter College, CUNY, and at Columbia University. You can find Ben on Twitter at @benmiller314.

Teaching Empathy Through Virtual Reality


In Philip K. Dick’s Do Androids Dream of Electric Sheep? the U.N. secretary proclaims, “[m]ankind needs more empathy” (1968). The poignancy of Dick’s novel is its accurate expression of the social challenge of diminishing human empathy. The author offers empathy as the defining characteristic of humanity. As is often the case, science fiction foreshadows our future: longitudinal studies show decreasing rates of empathy in college students over the last three decades. If we believe that empathy is indeed a vital quality, then humanists are uniquely qualified to address this decline: extensive research suggests that empathy can be taught, specifically by reading fiction. Furthermore, preliminary trials indicate that virtual reality (VR) effectively evokes feelings of empathy in viewers. In both cases, the medium can provide the audience with access to situations outside of their everyday experience, offering a perspective into the lives of people unfamiliar to the reader/viewer. Take, for example, the work of documentary filmmaker Chris Milk, which immerses the viewer in war-torn villages in order to impact immigration policy (see “How virtual reality can create the ultimate empathy machine,” 2016), or the content of the New York Times VR application, which addresses a wide variety of social justice issues from all over the world. However, as critics such as Janet Murray rightfully argue, the impact of VR is dependent on the execution, which is still in development stages: “[t]he technical adventurism and grubby glamour of working in emerging technologies can make it hard to figure out what is good or bad from what is just new” (“Not a Film and Not an Empathy Machine,” 2016). As the digital humanities has encountered with other emerging technologies – most notably data visualization techniques – these new forms need to be critiqued as they evolve (Drucker, 2012).
Inviting students and educators to collaborate with industry professionals in the process of consuming, critiquing, and creating open access VR content creates the opportunity to design thoughtful immersive experiences that may address the decline in empathy in college-age students. This presentation will explicate a study-in-progress devised to measure the pedagogical impact of VR content in combination with design thinking assignments used to combat desensitization and evoke empathy across the disciplines.

This research is supported with a case study of students in a series of linked courses at a small liberal arts college in Baltimore, MD. Students were exposed to VR content intended to increase their feelings of empathy for people who represent the “Other” in various ways, such as gender, race, ethnicity, and social class. This study was created through a cross-campus collaboration between faculty from the humanities, social sciences, and school of design alongside the theater director and librarians. Using empathy as the central question, each course integrated VR content and related readings into the curriculum. In each case, VR provided access to experiences not possible within the classroom space, for example an immersion into a refugee camp, a simulation of the human brain, and a documentary depicting gender bias across cultural contexts. The VR was scaffolded into each course in discipline-specific ways. For instance, the literature courses focused on readings that depict representations of virtual bodies in tandem with theory on posthumanism, particularly the work of Katherine Hayles and Donna Haraway. At the same time, the theater program produced The Nether by Jennifer Haley, which raises questions about the laws governing virtual spaces through depictions of pederasty and the murder of young children. Simultaneously, courses in psychology and human services integrated VR to discuss the impact of immersive content on social justice reform, and nursing courses looked at the application of VR for patient care and education. To varying degrees, this work was supplemented with readings on feminism, race theory, and disability studies in order to support discussions of “othering” with students. After analyzing the VR content in conjunction with the course materials, students were  asked to design a VR experience intended to evoke empathy in the context of a discipline-specific audience. 
Additionally, members of a local VR company contributed as guest speakers and offered internships for interested students. Surveys were distributed at the beginning and end of the semester which prompted students to define, discuss, and debate empathy. At the end of each course students were interviewed to identify which methods of engagement increased their empathy toward people (in some cases characters) they felt were unlike themselves in significant ways.

As a part of this submission the syllabi and assignments will be shared. Ideally, the speaker will bring a VR headset and gaming laptop so participants can experience and consider how this emerging technology can evoke empathy by providing access to geographical, cultural, political, and biological content unfamiliar to the viewer. The goal is to receive audience feedback on the first stage of this study in order to improve and refine the methods before executing the plan on a larger scale. This study is IRB approved and student consent will be obtained for any student work that is presented.

#MLA15 Presentation

Tales from a Silver Medalist: Publishing an Interactive, Collaborative Article in JITP

The following post contains the slides and transcript from my presentation at the Modern Language Association convention held in Vancouver in January 2015. This presentation was accepted as part of a panel on scholarly communication; here is the call:

What Does It Mean to Publish? New Forms of Scholarly Communication   

Combining the immediacy of a blog post with the rigor of a refereed journal, “middle state” publishing is gaining ground in the humanities. How does middle-state publishing — also known as “grey literature” — challenge our notions of what makes something “published”? Scholars wrestle with the import of this question for hiring, tenure, and promotion decisions, while librarians, archivists, and the MLA International Bibliography struggle to document and preserve emerging forms of scholarly communication. This session seeks papers that will engage with the question of what it means to publish. How might institutional repositories constitute a form of publication? How do new tools and methodologies suggest new categories for indexing and analysis? How do new categories of scholarly publication challenge and change how we keep the scholarly record? How do we archive emerging material?

After a brief introduction and series of thank yous that accompanied my first slide, here is the text that accompanied my visuals:

First, a little about the journal in general. The Journal of Interactive Technology and Pedagogy (JITP) is open access and entirely online, built on a customized site on the CUNY Academic Commons – which may be a familiar form to many of you since it is the same platform as the MLA Commons. JITP was founded at the Graduate Center, CUNY as a potential space to showcase the work of doctoral students in the Interactive Technology and Pedagogy Certificate Program, and as a student in that program I was invited to be a member of the editorial collective before the first issue launched. The conversations that we, the editorial collective, have had over the past three years have challenged and expanded my conception of scholarly production. These issues range from philosophical and ethical to practical and structural: for example, we grapple with questions of copyright, permission, archiving, and indexing, but also issues concerning what types of media to accept, how to stipulate “article length” in new media submissions, what citation style to maintain, and how to mentor authors whose submissions show potential but don’t quite meet our criteria. As a new journal in the relatively new world of open access publishing, we find that many of these questions have few or no examples to provide precedent. The journals we do look to include Kairos, whose editor Cheryl Ball has been very generous in her mentorship.

JITP is working to remix the scholarly journal in a myriad of ways, concentrating on enacting a publication model in which both the form and process adapt to meet new modes of composition. The image here is the JITP mission statement – itself an evolving text – and you can see here, as in our title and Twitter handle, that our emphasis is on pedagogy. Focusing on a transparent and collaborative peer-review process, this presentation will chronicle the production of the article “Digital Literary Pedagogy” as an example of how editors of online academic journals can work with contributors to expand the definition of publication in innovative ways.

Let’s start with the way our Editorial Collective works. We maintain a balance of students, faculty members, and staff (including librarians, deans, and administrators). We also strive to have two members edit each issue, with one student and one full-time faculty or alt-ac member paired together. This isn’t as easy as it seems considering many of our students have landed excellent jobs across higher education. Which is wonderful! However, this goal remains central to our mission because it is through these relationships that we model the mentorship we hope to extend to our authors.

As you can see from this graphic, authors are brought into our process as fully as can be expected considering the many layers of labor that occur behind the scenes. Each submission goes through a minimum of three stages of review – all of which are as transparent and open as possible – starting with a double “not-blind” review. The authors receive letters from their reviewers that suggest changes, and then many are assigned to a specific collective member who works with the author to enact those revisions. This is followed by copy-editing and style and structure reviews, which are a subject for another talk. This talk is about a project inspired by the relationship between authors and editors. Roger Whitson, an assistant professor of English at Washington State University (coincidentally presenting at this same time in another session), wrote to JITP asking if we would like to collaborate with him on a semester-long undergraduate course he was teaching on technologies of reading in the nineteenth century. Roger’s original pitch was grandiose but exciting, and really appealed to our desire to be innovative in both form and process. Along with Kimon Keramidas, Assistant Professor and Director of the Digital Media Lab at the Bard Graduate Center, I volunteered to be a part of this experiment.

The timeline you see here is actually the active navigation for the resulting webtext – and will help me provide a narrative for this presentation. You can go to the live site on your own to see this in action. In the first week of class Roger introduced the assignment: the students were to create a collaborative digital project based on the course content as a mock submission to our journal. Then Kimon and I used Google Hangout to speak with the class about the basics of academic publishing, digital scholarship, and JITP. The class then went to work reading, writing, and building webtexts to showcase their work. At the end of the term Roger shared the students’ projects with us (they are publicly available on the course website), and Kimon and I critiqued them as if we were reviewing them for our journal. The students then presented their projects to us live over Google Hangout, and we then explained our feedback to the class in response. Meanwhile Roger, Kimon, and I documented this entire process in a co-authored text done via Google Docs. Although these are essentially three single-authored sections reflecting our individual experience and expertise, you can see how significantly we influenced each other’s writing through the comments and revision history on our Google Doc drafts – all of which are available as part of the final product. The conglomeration of these elements – the course site, videos, student projects, drafts, and article text – was reviewed by the issue editors and members of the JITP review board in the same way all other submissions are treated.

Subsequent to publication, “Digital Literary Pedagogy” was nominated for a Digital Humanities (DH) Award. The nominations and votes are all crowdsourced through social media – meaning anyone can nominate a submission and anyone can vote, but it is anonymous. I do not know who nominated or voted for our article.

To give you some perspective, in 2012 there were three winners in the category our article was nominated for – which is “Best DH blog post or short publication.” You can see that even in the category title there is ambiguity in the amalgamation of form that for me represents the shift in scholarly production that is happening in the digital humanities. All three of the 2012 winners were very innovative in their form – continuously evolving, interactive, public sites of scholarship. As you can see in the example I provide here, Will Self and his collaborators created an interactive network visualization as the navigation for this webtext, which also includes social media integration and other interesting features that explore the affordances of the digital space.

In 2013, the year we were nominated, there were far more texts featured on the DH Awards site, and the winners represent a very different view of digital humanities production. Almost all of these do meet the requirement of being short – they are either blog posts or brief articles in journals – and, as is necessitated by the process of choosing a winner, they are all accessible online. Also, all but two include some kind of multimedia, and some are very multimodal, such as the “Songs of the Victorians” project, which incorporates images, sound, and text in novel ways. However, the winner is a 43-page PDF of a book chapter. It is not interactive or multimodal in any way. That isn’t to discount the smart, engaging content, which is well-researched and written. But I do question what its inclusion and eventual win says about the state of the digital humanities.

And this brings me to the crux of this presentation. In my research I have found convincing scholarship calling for a revolution in academic publishing dating back to 1996, with a huge spike in the early 2000s when the “crisis” seemed to peak under economic pressures just as the blogosphere gained momentum. However, despite the many grant-funded investigations that have reached the same conclusions regarding the unsustainable trajectory of scholarly monographs in both book and journal form, we still return to these forms as our primary measure of evaluation in the humanities. Why? Because of the three points made by Risam:

Three principle differences between digital and print scholarship in the humanities require a radical revision to how we review and assess scholarly production and to how scholarly work accrues value: digital scholarship is often collaborative, digital scholarship is rarely finished, and digital scholarship is frequently “public.” (“Rethinking Peer Review in the Age of Digital Humanities,” Roopika Risam.)

It is difficult to evaluate scholarship that is collaborative, public, and perpetually in beta. But I want to take this one step further. What can we, as academics producing digital work, offer that the consumer-driven world of publishing, technology, and new media isn’t? It is difficult to compete with the sleek, user-friendly products made by tech conglomerates – but what they aren’t offering is transparency.

And this brings me back to our article. Roger, Kimon, and I did not include the videos and drafts just for the sake of adding technology. Our intention was to show our process so that other instructors could learn from this experience. This goal extends across the journal – which is particularly evident in our short-form sections – Teaching Fails, Assignments, Tool Tips, and Reviews. These sections are meant to be instructive, show process, and focus on pedagogy. I believe this is what we, in higher education, should aim for in the future of scholarly communication, because now we can achieve this goal at a deeper level. We can make our work open access and open source, allowing audiences to reuse, remix, and rebuild our work for educational purposes. We can engage with process at a meta-level, through text and code.

Just this week, Sarah Thomas, vice president for the Harvard Library, was quoted in Harvard Magazine as saying, “We are still in the Wild West of sorting out how we will communicate our academic developments effectively.”

The digital disruption of the print world is transforming both commercial publishing and scholarly books and journals – and is changing structures for teaching, research, and hiring and promoting professors. Obviously, Roger Whitson assigned that project to his students because he believed it to be a valuable scholarly engagement that would help his students build the skills they need to succeed both within and without the academy. Many of us engage in similar practices in our classrooms every day. But what are we preparing students for if two decades of discussion, research, and calls for change have yielded such incremental impact on our own methods of reward in the academy? What can we do to make these forms of scholarly production count? Well, as many others have called for – notably John Unsworth – we can pledge to only publish in open access journals, we can work on publications like JITP and Kairos, we can negotiate within our institutions to change hiring and tenure practices, and we can continue to teach multimodal, collaborative composition across the disciplines. But we also need to start talking about all of the issues I mentioned at the start of this presentation: the nitty-gritty details of digital production that need to be addressed and adopted on a large scale in order to ensure the reliability and longevity of our work.

So let’s chat! Tweet, post comments, email, post on listservs or Facebook! Let’s continue this conversation, and work together to find sustainable solutions.

I want to thank everyone who attended our panel for the provocative conversation that occurred in the question and answer portion. I’d also like to thank Dawn Childress for her excellent organization and moderation, and Harriet Green and Barbara Chen for their presentations in this panel.

Overall, this was a truly wonderful conference. Vancouver is simply breathtaking, and the location of the convention center allowed us to take full advantage of the natural splendor of British Columbia.

Also, I really feel that the panels I attended were some of the most inspiring of all five MLA conventions I have attended. I saw a great attention to pedagogy and innovation that excites me and holds tremendous potential for the future. I am also thankful for my friends and mentors who took the time to offer guidance and support at #MLA15. You know who you are, and it means the world to me.

As always, I’d love to hear your thoughts in the comments!


#AAEEBL2014 Presentation

Here is my presentation for the 2014 AAEEBL conference, complete with text. Thank you to those who came, to Macaulay Honors College for their support, and to those reading this for their constructive feedback.


After over a decade of integrating eportfolio technology into the post-secondary classroom, where do we stand? The pedagogical practice of asking students to compose in open, online, multi-user spaces has grown rapidly in recent years. There are a host of advantages that support this practice, including that writing in public venues cultivates digital literacy through broader audience awareness, facilitates interactivity and collaboration among peers, and supports the creation and integration of multimedia artifacts into the writing process. Addressing the lack of systematic study of students’ preparedness to write in online spaces, and evidence that these practices foster the development of long-tail, real-world skills, this presentation will demonstrate qualitative and quantitative approaches to investigating these assertions. Rather than focusing on administrative measures of success, this investigation focuses on the learning process of students.

Full Text (note slide advance prompts are included):

When this call for papers came out, it was as if I had created it to match my dissertation project, since the research track – “Data-driven Evaluation of ePortfolios in an Age of Increased Accountability” – expresses my topic exactly. In fact, I am going to use the prompts provided by the call to outline my project in order to present my hypothesis, data, methodology, and initial findings.


  1. What are the best practices for evaluating the “value-added” of a particular course or program?
  2. Which types of eportfolios are more successful in measuring learning outcomes?
  3. Effective methods of analysis and evaluation
  4. What do we know so far from research? What are the important questions still ahead?


What are the best practices for evaluating the “value-added” of a particular course or program?

From blog posts to scholarly journals, and of course the rising interest of popular media outlets, everyone seems to have an opinion on the integration of blogging technology in higher education. Even a cursory Google search produces a host of constituent assertions that support the use of online writing platforms, such as eportfolios, in college-level courses. Claims in favor of this integration include that writing in public venues cultivates digital literacy through broader audience awareness, facilitates interactivity and collaboration between peers, and supports the creation and integration of multimedia artifacts into the writing process. However, most of these assertions are based on anecdotal narratives or survey results that focus on the experience of the faculty and administrators involved.


What I am seeking is evidence derived from the content of the compositions created by students in online, open spaces, and the value of this experience as articulated by the students themselves. Therefore, this project seeks to address the lack of systematic study of students’ writing in online spaces, the multimodal aspects of digital composition, and evidence that these practices foster the development of long-tail,[1] real-world skills.

This study is an attempt to investigate the assertions made about the integration of digital writing in higher education through a combination of qualitative and quantitative research. By applying digital humanities methods and composition theory to almost a decade of student writing produced in an online, open eportfolio system, I will look for evidence of the “value added” through the adoption of a well-supported, cross-curricular implementation of eportfolio technology.


Which types of eportfolios are more successful in measuring learning outcomes?

The research is drawn from a case study of six consecutive years of eportfolios, culled from the Macaulay Honors College (Macaulay), a unique honors program within the City University of New York system that spans eight public university campuses. This group was chosen for study for a number of reasons, involving the particular set of benefits afforded to these students, as well as the demographics of the population itself. Each student is provided with a new laptop computer, dedicated advisors, and full tuition, theoretically eliminating some variables with regard to access and availability of tools and support. The Macaulay student population is notably diverse, consisting of students from a wide range of ethnic, racial, and economic backgrounds, with a significant portion being first-generation college students. These students take the same four seminars in their first two years of the program, and learn the same software in the course of their studies. The program is supported by Instructional Technology Fellows (such as myself) who run workshops and immersion events and are available for consultation throughout the students’ coursework. Therefore, although the test group is made up of a diverse sample of students, since they are provided with equal – and exemplary – resources in the pursuit of their studies, the Macaulay students represent a strong case study.


Just as this is a particularly strong sample of students to study, this eportfolio system also presents significant advantages. Not all eportfolio platforms are created equal; many proprietary programs are walled gardens available only to those within the university (often only in the class) and do not allow the students to access the backend in order to experiment with functionality and design elements of the site. These skills are in high demand, and selecting a system that bars this level of engagement in digital literacy is a missed opportunity for long-tail education.


The eportfolio system this investigation focuses on is run on WordPress, a blogging platform with BuddyPress, a social networking function, built-in. As of the end of 2013, an estimated 20% of the Internet was built on WordPress, so working on this platform in an educational setting provides students with the opportunity to develop real-world skills.

From a usability standpoint, WYSIWYG blogging platforms are a desirable content management system for use in higher education. With lower barriers to entry, and the familiarity of the basic composing functions, users comfortable with desktop publishing can transition to the online space with minimal instruction. In WordPress, the capabilities of the platform that extend beyond composing deal mostly with the design of the front-end: the choice of theme, the information architecture of the site, and the ability to draw in outside information to be displayed in the widget areas of the site. The skills needed to control these elements can be outsourced to a site administrator, which in the case of Macaulay Honors College is typically a combination of the instructor and the Instructional Technology Fellow. But both the platform and the mediation by the administrators distance the student from understanding how the technology works, and deny them unimpeded control of their compositions. Essentially, these mediators are doing for the students the work that makes online publishing different from composing on paper or in a word processor. This is a missed opportunity currently being addressed by many innovative instructors in higher education.


For example, Karl Stolley has his students compose entirely in code, setting up their own servers and designing their own sites from the metaphorical ground up. As Stolley argued in his keynote address at the 2013 Computers & Writing conference:

 Given the opportunity for extended encounters with difficulty (rather than the software tools that route around it), digital writers can become specific intellectuals: people whose deep technological expertise rivals that of their command of rhetoric–who are therefore able to learn, teach, and build things that scare the living crap out of others.

Stolley’s assertion includes two supporting points worth mentioning here: first, that even though all digital writing is difficult, if we can avoid “excessive mediation” (an example of which would be a content management system that does not grant access to the backend, like Blackboard) then we can avoid the second pitfall, which is the need to “keep up” with impediments such as platform upgrades that can delay progress. The remedy for these common pitfalls in digital writing pedagogy for Stolley remains command line level learning.

Although Stolley’s practice represents one extreme, albeit admirable, approach, it is representative of a turn away from remediation back toward the fundamentals of computer programming (“In Search of Troublesome Digital Writing: A Meditation on Difficulty”). That the future of rhetoric is an ability to communicate with computers is at the heart of this movement.


Returning to Vygotsky’s theory of cognitive development, I believe a digital writing platform can serve as a “zone of proximal development” for the student of digital writing. In fact, a site created for the purpose of playing and learning is called a “sandbox” by the WordPress community. The middle ground between the command line and the word processor represented by a WordPress Dashboard functions as a learning environment in which students can develop skills that could be applied to more advanced system engineering. At Macaulay the curriculum committee committed to extend programmatic learning objectives beyond this first stage of development by having students create WordPress sites as a class for the final project in the second seminar. Positioning this project in the second semester of their first year gives students a chance to acclimate to the WordPress interface before embarking on the advanced work of designing a site. The website project teaches the students several critical digital literacy skills, encouraging them to see the relationship between the content and design of their site by working through information architecture choices and usability design decisions as a group. This collaborative engagement mimics a professional environment, and allows students with technical aptitude or design proclivities to guide those whose strength may be in research and writing. Since over 20% of the web is built on WordPress[1], knowing the difference between a page and a post or a widget and a plugin, and understanding how to choose and customize a theme, are “real-world” skills attractive to employers. This curriculum-wide scaffolding prepares Macaulay students to create their own personal portfolios as well, an option many students embrace in order to build their online presence in preparation for the next stage of their career.
This study aims to explore the connection between the course sites and the personal sites in order to identify the transference of skills from the teacher-directed content to the student-directed content.


By inviting Macaulay students to participate in the design of their course sites and to create web-based projects in conjunction with traditional research and fieldwork, and by further encouraging and supporting them in building their own eportfolios that exist outside of their formal coursework, the college aligns its educational philosophy with the increasingly widespread “Maker” movement happening both within and around the academy. “Critical Making,” as defined by Matt Ratto and Stephen Hockema, “is an elision of two typically disconnected modes of engagement in the world – ‘critical thinking,’ often considered as abstract, explicit, linguistically-based, internal and cognitively individualistic; and ‘making,’ typically understood as material, tacit, embodied, external, and community-oriented” (52) [Ratto, Matt and Megan Boler, eds. DIY Citizenship: Critical Making and Social Media. Cambridge, MA: MIT Press. (In press; forthcoming January 2014)]. The claim, then, is that an education based on building and making will lead to a combination of practical proficiencies, experience working with others, and critical thinking skills that can be adapted to meet a variety of complex tasks. It is a future-thinking mode of learning that positions students not just to accomplish the mission at hand, but to envision the next problem and solve it.

As Roger Whitson writes, “making enables us to rethink how a different combination of methods and practices could create different gadgets, experiences, and histories.[…] Making is not simply a way of understanding; it is also an investigation of what could have been” (“Steampunk”).

Hosting the eportfolio system on WordPress not only allows the undergraduate students to take on an active role as makers; it has also enabled me, as a graduate student, to access the data they produced and to engage in the act of making new as well. Since Macaulay owns and operates these sites on its own server, and because the large majority of these sites are public, I was able to download the content of over 3000 sites in a MySQL database, which can then be manipulated and reformatted in multiple ways.


Why would I want to delve into thousands of lines of messy data? Because I strongly believe that we, as educators, should reclaim ownership of the digital material we produce. Rather than allowing corporations to mine our data in order to sell it back to us through advertising and expensive long-term contracts that lock us into products that limit us in a myriad of ways, we should be building, maintaining, and mining our own sites. This study is grounded in the philosophical belief that if we view writing as data, and treat it as such through the use of digital humanities tools (distant reading, data mining, and data visualization), we can improve our pedagogy based on what we extract from this data. This meta-approach to research is experimental and experiential – as Kathleen Fitzpatrick put it, I am “doing the risky thing” by attempting a methodology that requires a high degree of technical difficulty and is still met with skepticism in the humanities. Perhaps, if I can show what is possible to discover when we control our own course management systems and use them for research and assessment, my study can be used by others to argue for the move to open source platforms in higher education.


Effective methods of analysis and evaluation

The data for this research is being collected in three phases, with each phase investigating a stage of development in the learning process of Macaulay students exposed to the eportfolio system as part of their undergraduate coursework. This investigation utilizes both data triangulation and method triangulation. The first phase is a survey of incoming students to Macaulay Honors College, examining their online reading and writing habits prior to enrollment. The second is a data analysis of student work on teacher-directed course sites (with particular attention to student writing, tagging, and citation practices), and the third is a series of interviews with the winners of an institution-wide eportfolio contest, with in-depth analysis of the corresponding student-directed sites.

The first stage of this study is a voluntary online survey presented to first-year students, focused on understanding their online reading and writing habits prior to entering MHC. The survey was distributed in Spring 2014, and will be distributed again to the incoming class in Fall 2014. The purpose of this survey is to assess how prepared the students are to compose in open, online spaces when they begin the honors program by revealing how often and to what extent they have engaged in online writing practices in their personal, professional, and educational lives before entering college. The survey asks the students to identify the sites they use to communicate on the web and their degree of interaction within these online writing spaces; for example, whether they used the Internet to gather information, to create original material on a website, or to design and manipulate the infrastructure of a website.


*survey findings*


The second stage of this study involves data mining material from six years of teacher-directed eportfolio sites. These course sites consist of content generated from the four seminars required by MHC: Seminar 1, The Arts in New York City; Seminar 2, Science and Technology in New York City; Seminar 3, The Peopling of New York City; and Seminar 4, The Future of New York City. The Associate Dean of Teaching, Learning and Technology, Dr. Joseph Ugoretz, worked with me to strip out any of the data that was marked private, so that the dataset comes only from sites on the network that are completely open to the public. This is the stage I am working on now, and this is why I am attending this conference. Thus far, I have been working with Micki Kaufman, a Digital Fellow at the Graduate Center, to clean this data into usable chunks. First, we selected specific classes that contained robust data, such as content-rich posts. Using these test cases, we were able to look at relationships represented in the data – such as this chart showing frequency and length of post by each contributor. This graph is done in Excel as a proof of concept, but for large-scale visualizations this work will be done using Gephi and D3. We are also removing extraneous material such as the HTML markup present in the text so that I can filter it through concordance software such as Voyant.

Much of my work distant reading the sites is informed by a combination of composition and rhetoric and new media theory. I am drawing from process theory and cognitive psychology, as well as case studies that implement scientific methods and the history of portfolio-based instruction, to structure and define my study of student writing in online open spaces.


For instance, I am using case studies that “code” student writing, such as those performed by Janet Emig and Sondra Perl. In “The Composing Processes of Twelfth Graders,” Emig identifies the “two dominant modes of composing” as the reflexive and the extensive:

The reflexive focuses upon the writer’s thoughts and feelings concerning his experiences; the chief audience is the writer himself; the domain explored is often the affective; the style is tentative, personal, and exploratory. The extensive mode is defined here as the mode that focuses upon the writer’s conveying a message or a communication to another; the domain explored is usually the cognitive; the style is assured, impersonal and often reportorial. (4)

I wish to make use of Emig’s terms while complicating the idea of public versus private as applied in the digital space. The qualities Emig attributes to the reflexive mode, “affective” and “exploratory,” are hallmarks of many of the assignments found in portfolio-based classes; however, in the case of online, open eportfolios the intended audience is not the self, but an external audience of instructors, peers, and anyone searching for the topic on the Internet. Emig defines the reflexive mode as internal, what she calls “self-sponsored” writing, whereas writing in the extensive mode, or “school-sponsored” writing, is intended for an outside audience. However, when considering writing in a public online space this designation falls apart, since the blogosphere invites the writer to mesh both of these modes. The personal, confessional style[6] of the blogging genre matches Emig’s description of “reflexive” writing, but the public audience not only suggests the extensive mode but goes beyond it.


Similarly, the digital space complicates the notion – coined by Peter Elbow – of “low” and “high” stakes writing. Now fairly ubiquitous across the education system, low stakes writing assignments tend to be short exploratory exercises that build toward formal “high stakes” assignments, which carry a large percentage of the course grade. On paper, this practice can take many forms, such as in-class free writing, out-of-class journal entries, or ink shedding – a timed pre-writing exercise. This writing can be private or public; the key is that it is not graded or assessed in terms of error. Online, low stakes assignments manifest in familiar forms, such as short reviews and reflections on a course site, but also in new formats such as blog posts, discussion forum threads, comments, or contributions to a social media site (such as Twitter, Facebook, Tumblr, etc.). In the digital realm, low stakes writing is almost always public, even if just to a limited audience. From my work examining student writing in the Macaulay setting, blogging assignments tend toward reflection and self-evaluation. In many of the course sites examined for this study, student posts were meant to express the individual’s response to a text, event, or experience. While some contain the marks of scholarly work, such as research and citations, the content conveys a personal context, rather than an academic one. However, while these short written responses may only be assessed in terms of completion, many would consider the venue to be a high stakes environment. Since the writing is public, and it is often read by peers, the teacher, and potentially the general public, the pressure of an audience raises the stakes.


What do we know so far from research? What are the important questions still ahead?

As Collin Brooke points out in Lingua Fracta: Toward a Rhetoric of New Media, we look to the rhetoric of old media to answer our questions about new media practices. So, when asking about the drawbacks and benefits of blogging, we cannot define the results through the language of old forms, such as the academic essay, journal article, or scholarly monograph. Instead, there should be an effort to examine the new technology in order to determine how its form and function shape the writing process. For that we must look at how people use the technology, and how the technology uses us.

While building on process theory can provide a context for identifying the stylistic content of student writing on online, open sites, this ignores the other essential element of composition in digital spaces: the space itself. When investigating the writing process in digital spaces, the interface design must also be considered as an active agent. As Doug Rushkoff said in a 2014 presentation at the CUNY Digital Humanities Initiative, “I could teach more by analyzing the design of the Blackboard interface than by teaching with Blackboard.” As research in the area increases, it becomes increasingly evident that design mediates our composition process in significant ways that need to be accounted for and articulated. In their 2005 case study presented in “Movement in the Interface,” Synne Skjulstad and Andrew Morrison work through the difficulties of articulating interface design in their attempt to describe the process of building a multimodal site (BallectroWeb). They write:


 Studied in terms of human-computer interaction (HCI), interfaces have been thought of as intermediary to communication. However, interfaces have come to be understood as more than a static, graphical layer lying between system and user. They exist as devices for shaping and spatialising the organization, selection and articulation of what is to be communicated electronically. As a result, interfaces are now an integral and dynamic part of communication design as a whole. (Skjulstad 415)

Drawing from Lev Vygotsky’s concept of the mediating artifact from Activity Theory, Skjulstad and Morrison conclude that the “constructedness” of the interface mediates the content. If taken as true, the content management systems on which eportfolios are built and managed affect the content itself, and therefore no two systems can be taken as equivalent. The decisions made for the writer by the interface design are as important to the final product as the choices made by the authors themselves.


Students compose in WordPress using the “backend” of the platform, an area not visible to a viewer who does not have editing privileges. Known as the “Dashboard,” the control panel for the site closely resembles a word processor, with icon-based action buttons that represent common tasks. The remediation at work in the iconography of word processing programs, such as a floppy disk image for the save function, has been discussed elsewhere, but the transference of these structures to the blogging interface carries further implications. Designing the backend of the blog to look like a blank page to be filled with text signals to the composer that words should be the primary mode of creation. Most of the icons offer options to manipulate the text, including font styles, font sizes, and color options, alongside functions that directly apply to the delivery of text such as spell check, line spacing, and paragraph formatting. The majority of the elements that encourage the composer to experiment with multimedia also match those found in word processors – such as the ability to add hyperlinks to other webpages, internal bookmarking features, and a WYSIWYG insert media function which uploads images, info-graphics, or videos from files on your computer. All of these align the Dashboard with the word processor rather than with the practice of the original bloggers, who wrote at the command line, which required code. This move away from composing with mark-up languages such as HTML and style sheets such as CSS is an interesting one, with long-tail benefits and drawbacks.

Participatory design is an amalgamation of classical rhetoric, new media theory, and critical pedagogy; its proponents argue that by developing an understanding of the mode of delivery through which we communicate, we are better able to craft our message and reach our audience(s). In their article “Toward a Public Rhetoric Through Participatory Design: Critical Engagements and Creative Expression in the Neighborhood Networks Project,” Carl DiSalvo et al. write:


Taken together, critical engagements with technology and the creative expression of issues through technology begin to form a public rhetoric: They constitute the activity of discovering, inventing, and delivering arguments about how we could or should live in the world. The artifacts or systems conceived or created become rhetorical by their persuasive intentions and capabilities, and by the way they inform and/or provoke a response from or dialogue with others. (48-49)

In its ideal form, an eportfolio system built on an open platform enables learners to make sophisticated design choices; in the process of conceptualizing, implementing, critiquing, and revising the digital space, students develop a deeper comprehension of the relationship between content and delivery. Scholars such as Collin Brooke and Ben McCorkle have already made the connection between design in digital publication and delivery as a canonical rhetorical mode. Both scholars claim the field of writing studies has neglected the rhetorical modes in recent years, and call for a return to theorizing particularly the role of delivery in the age of digital publication. This call is echoed by DiSalvo et al., who argue,

Positioning design as rhetoric does not claim some essential or deterministic quality of technological artifacts or systems. Nor does it suggest that design is fundamentally duplicitous, as contemporary pejorative notions of rhetoric might imply. Rather, positioning design as rhetoric calls attention to the ways in which the built environment reflects and tries to influence values and behavior and explicitly recognizes the capacity of people to design artifacts or systems that promote or thwart certain perspectives and agendas (DiSalvo, 48-49).


Last stage: understanding student-directed design and delivery choices

The third phase involves a detailed study of five or six (based on availability) student-created eportfolios that were selected as Eportfolio Expo[1] winners. The Eportfolio Expo is a self-nominated contest of student-directed sites judged by a panel of professionals who rank the submissions based on preset criteria. In the Fall of this year, I will conduct an interview of each of the student winners and perform a “close reading” of their sites, to assess how the use of eportfolios in their coursework influenced their personal sites.[2] What I am looking for here is evidence of transfer – or the application of skills learned in the classroom to activities performed outside the boundaries of assigned coursework.

Lines of further inquiry:

– Interface design/remediation

– Rhetorical mode of delivery

– Students as Designers

– Re-evaluate collaborative work

– New form of the blog – static vs dynamic

Playing with Data

This will be the first in what I hope to be a series of posts describing how I began dealing with my dissertation data. I am writing these posts in order to help others embark on data-driven projects, as well as to document my process.

It is important to disclose that I am new to working with data in MySQL, and that my dataset of over 3000 course sites would probably be intimidating to even a seasoned expert. But I strongly believe that we should constantly engage in work that challenges our abilities if we want to continue to grow as scholars. My committee members – Matt Gold, Sondra Perl, and David Greetham – are excellent examples of academics who never stop exploring and experimenting, and the results speak for themselves.

To begin, I received my data from Joe Ugoretz, the Associate Dean of Teaching, Learning and Technology at Macaulay Honors College. Joe happens to be an incredible visionary and an amazing mentor to me and the other Instructional Technology Fellows he leads. Joe and I worked with Boone Gorges and John Boy to figure out a way to remove any private information from the SQL dump of data generated from the backend of the Macaulay ePortfolio site. This was an important step, especially in terms of the Institutional Review Board, to ensure that any information posted to the course sites that was marked as private remained so in my study. Similarly, all of the participants are identified by node number (based on when they joined the system), not by name.

If you have never seen a MySQL dump from the backend of a WordPress site, trust me when I say that it is extremely interesting, but also very messy. In the table that is generated from just one site you may find over a dozen columns of metadata, and the content may also contain difficult-to-export HTML markup. Multiply that by 3000 sites, and I have a lot of material to organize. In order to deal with this onslaught of data, I am using Sequel Pro (at the suggestion of Joe), which is essentially a database management application through which I can store, view, and manipulate my data. Note that to use this program I set up my own local server using MAMP, which is a good first step for anyone embarking on digital work.
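
For readers who want a feel for what one of these tables looks like, here is a minimal sketch in Python. It is not the query I ran: it uses an in-memory SQLite database as a stand-in for MySQL, keeps only a handful of the standard WordPress posts-table columns (`post_author`, `post_date`, `post_content`, `post_status`, `post_type`), and the sample rows are invented for illustration. The helper name `published_posts` is hypothetical.

```python
import sqlite3

# Stand-in for one site's posts table. A real WordPress table has many more
# columns, and a multisite install names the tables per site.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE wp_posts (
           ID INTEGER PRIMARY KEY,
           post_author INTEGER,
           post_date TEXT,
           post_content TEXT,
           post_status TEXT,
           post_type TEXT
       )"""
)

# Hypothetical sample rows, invented for illustration only.
rows = [
    (1, 12, "2013-09-03 10:15:00", "<p>Museum visit reflection</p>", "publish", "post"),
    (2, 12, "2013-09-03 10:14:00", "<p>Museum visit</p>", "inherit", "revision"),
    (3, 15, "2013-09-05 18:40:00", "<p>Draft notes</p>", "draft", "post"),
    (4, 15, "2013-09-07 09:00:00", "<p>About this class</p>", "publish", "page"),
]
conn.executemany("INSERT INTO wp_posts VALUES (?, ?, ?, ?, ?, ?)", rows)


def published_posts(connection):
    """Return (author id, date, content) for published blog posts only."""
    query = (
        "SELECT post_author, post_date, post_content FROM wp_posts "
        "WHERE post_status = 'publish' AND post_type = 'post' "
        "ORDER BY post_date"
    )
    return connection.execute(query).fetchall()


posts = published_posts(conn)
```

Filtering on status and type is one way to set aside revisions, drafts, and static pages before any counting begins; whether that filter suits a given inquiry is a judgment call.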

Once in Sequel Pro, the data can be exported in a variety of formats and opened in programs such as Excel or TextWrangler. Micki Kaufman – one of my fellow Provost’s Digital Innovation Grant recipients, whom I consider both a friend and a scholar of incredible capability – has been helping me sort through the next steps of this process. First, we met to look through my data and play with small sets in Gephi. In fact, Micki led an informative workshop on Gephi in her position as a Digital Fellow at the GC and used my dataset as an example. In our most recent meeting we began to experiment with answering some basic questions using the data. Our first test was to take one course site that contains posts from over a dozen users to see how frequently each user posted. We exported one set of posts as both a CSV file and an XML table. It turned out that XML worked better because the HTML in the post content broke the lines in the CSV. In the XML table we cleaned up the data by eliminating the HTML tags and spaces. To do this we used grep patterns and tr. Even after cleaning the data we had some ugly rows of corrupted text. However, because of the nature of the content and the low number of the node, I was able to identify that these rows were composed by the Instructional Technology Fellow assigned to the course, not one of the students, and therefore could be deleted without affecting the results of my inquiry.
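
We did this cleanup with grep and tr, but for anyone who would rather script the step, a rough equivalent in Python might look like the sketch below. This is a swapped-in technique, not the workflow we actually ran; the function name `clean_post` and the sample text are hypothetical, and a regular expression is a crude way to strip tags that will stumble on malformed markup.

```python
import html
import re

TAG_PATTERN = re.compile(r"<[^>]+>")   # anything that looks like an HTML tag
SPACE_PATTERN = re.compile(r"\s+")     # runs of spaces, tabs, and newlines


def clean_post(raw):
    """Strip HTML tags, decode entities, and collapse runs of whitespace."""
    without_tags = TAG_PATTERN.sub(" ", raw)
    decoded = html.unescape(without_tags)
    return SPACE_PATTERN.sub(" ", decoded).strip()


# Hypothetical post content, invented for illustration only.
sample = "<p>The <em>Met</em> was crowded.</p>\n<p>Art &amp; science,  together.</p>"
cleaned = clean_post(sample)
```

Replacing each tag with a space, rather than deleting it outright, keeps words on either side of a tag from fusing together, which matters once you start counting words per post.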

Before plugging the numbers into a graph in Excel, we went back to old technology: paper. We sketched out a few potential graphs on paper in order to visualize what we wanted the graph to show us. Micki calls this the “napkin sketch” and reminded me of the importance of taking a step back to avoid huge up-front delays trying to clean all of the data when you should be focusing on the data you need to accomplish your goal.

The next steps went as follows:

  1. Converted the table to a tab delimited range of content
  2. Exported the table to Excel
  3. In Excel, turned the html field names into the column titles
  4. Deleted all the unnecessary columns (i.e., information not relevant to the inquiry)
  5. Added a new sheet and created pivot tables of node ID, number of posts, and characters per post
  6. Then we added a column in sheet 1 to calculate words per post (see formula here:
  7. Added the words per post column to the second sheet
  8. Created a scatter plot chart with the node IDs as points, the x-axis as the number of words, and the y-axis as the number of posts
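
The counting at the heart of these steps (posts per node and words per post) can also be sketched in a few lines of Python. This is not the Excel formula or pivot table we used; it is a minimal illustration that assumes a tab-delimited export with two columns I have named `node_id` and `post_content`, and it counts words by splitting on whitespace. The sample rows are invented.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-delimited export; the column names are assumptions.
SAMPLE = (
    "node_id\tpost_content\n"
    "12\tThe Met was crowded today\n"
    "12\tWe stayed anyway\n"
    "15\tA short note\n"
)


def summarize(tsv_text):
    """Return {node_id: (number of posts, mean words per post)}."""
    word_counts = defaultdict(list)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # Whitespace splitting is a simple stand-in for a word count.
        word_counts[row["node_id"]].append(len(row["post_content"].split()))
    return {
        node: (len(counts), sum(counts) / len(counts))
        for node, counts in word_counts.items()
    }


summary = summarize(SAMPLE)
```

Each entry in `summary` corresponds to one point on the scatter plot: one value for how often a node posted and one for how much it wrote on average.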

[Figure: Postsandwords]

This chart allowed us to visually represent the arc of activity on the course site. As one would expect, there are students who are significantly more prolific than others both in terms of number of posts and words per post, and then there are some that fall short of the median.

One small annoyance is that each node has to be entered separately in the schema, which takes a lot of time. But you can see from this shot of the work-in-progress what this kind of graph looks like using the basic Excel presets. It is also important to realize that we were playing with the data here. This example of graphing is a proof of concept to help me shape my ideas and figure out what is possible – this process won’t scale, so we must find another solution for the larger data set.

Again, a huge thank you to Micki for her continued support as I work through the next stages of my research.

The Writing Studies Tree

The Writing Studies Tree (WST) is an online, open-access, crowdsourced database of scholarly relationships within writing studies, composition/rhetoric, and related academic fields. Created by Graduate Center students in 2011-2012, the WST combines a fixed data structure with open editing privileges to rapidly aggregate thousands of individuals’ small data entry efforts into scalable network visualizations. Previous academic genealogies have been limited in scalability, access to data entry, or access to data readout; the WST takes a more Web 2.0 approach, involving as many participants as possible, and trusting the community of users to self-regulate. The project thus encourages users to see themselves not only as part of an evolving network of scholars, but also as contributors to the collective knowledge-making project of the field. In May of 2012, the Writing Studies Tree was awarded the Provost’s Digital Innovation Grant at the CUNY Graduate Center. Together with my co-investigators Ben Miller and Jill Belli, I have also applied for an NEH Digital Start-up Grant and the CCC Research Initiative Grant to continue developing this project.

WST Presentation for DHWI