Writing with AI: a Check-In
Is generative AI any better at producing good, interesting writing?
It’s been a while since I dedicated a post to an in-the-weeds, deep-dive exploration of generative AI tools. The emergence of NotebookLM, Google’s Gemini-powered “research assistant,” and ChatGPT’s writing assistant “Canvas,” inspired me to spend an afternoon trying to deeply collaborate with generative AI on writing.
The question I posed to myself: Given all the improvements to generative AI over the last two years, has it become a helpful, true partner at all stages of the writing process? The challenge I gave myself: Use generative AI to brainstorm, plan, organize, write, and edit a new “Learning on Purpose” post with as little intervention from me (aside from prompting) as possible.
I began by opening three tabs in my browser: 1) NotebookLM, which is free with a Google account if you are over 18 years of age (although not with your school Google account unless approved by your school); 2) ChatGPT, where I used GPTs and Canvas, both available with my ChatGPT Plus subscription; and 3) Claude Projects, Anthropic’s collaborative workspace, available with my Claude Pro subscription.
I chose these three tools because you can upload large amounts of content (files, text, media, etc.) that the bots then use as a knowledge base for generating responses and artifacts.
I uploaded to all three tools every “Learning on Purpose” post from 2024 so far (13 articles, about 25,000 words in total) and seven slidedecks from recent workshops I have facilitated (five about AI, two about learner-centered assessment). NotebookLM connects to your Google Drive, making it easy to pull in docs, slides, sheets, etc. In GPTs and Claude, I uploaded all the content as pdfs.
Then, I did what I often do when using generative AI: I gave each bot the exact same prompts in order to compare outputs and to be able to pick and choose the material most useful to me.
Exploring the Knowledge Base
Prompts
I am Eric Hudson, the author of all of these posts. What would you say are the five AI topics I refer to the most?
Which examples do I use the most?
Provide me with constructive feedback on my content: What are strengths and areas for growth?
What feedback do you have on the style and presentation of my work?
Can you find differences in the content of the slidedecks vs. the content of the articles?
Whenever I upload content into generative AI, my first step is to check to see if the bot is actually drawing from that content in its responses. All three tools did a good job on all five prompts, with minor differences in their responses. For example, ChatGPT said a major theme in my work is the concept of a “human in the loop,” something I’ve written about just once, while NotebookLM and Claude didn’t mention it at all.
All three tools had similar feedback for me. According to generative AI, I write great open-ended questions, but I don’t offer in-depth case studies. I provide concrete examples, but they don’t represent a global perspective on the topic. I agreed with both of these assessments and a lot of the other generated feedback.
The most surprising part of this stage was that all three bots made accurate, useful distinctions between what I cover in articles and what I cover in presentations. I thought asking them to distinguish between formats would spark errors, but their output was clear and helpful. While I didn’t always agree with AI’s “point of view” on my work, I did not find a single hallucination or error in the responses to these five prompts.
One clear and, I think, important advantage that NotebookLM has over ChatGPT and Claude: it links to the knowledge base. Its responses have numbered citations, and when you click on one, the cited source appears on the left side of the screen and directs you to the segment from which NotebookLM drew its idea. In my experience, it did this beautifully and seamlessly with both pdf files and Google Slides (I added the arrow on the screenshot below for clarity).
In this experiment, the knowledge base is composed of content I created, so it’s easy for me to assess the accuracy and usefulness of the output across all three AI tools. You can imagine, however, that NotebookLM’s citations could be very helpful when the content is new to the user.
Brainstorming and Researching a New Article
Prompts
What is an idea in the slidedecks that I could turn into an article in the same vein as my existing articles?
What research can you share with me that supports this article idea?
Can you curate external resources to support this article?
I've just added some research papers on the topic of student agency and AI. Did you notice what I added?
In looking at these papers, how do they support or weaken an article about AI and student agency?
All three bots accurately identified topics that I emphasized in presentations but had not yet written about in articles. In particular, both NotebookLM and Claude suggested some specific ideas from my decks on learner-centered assessment that I could apply to AI (I’m actually embarrassed I had not thought of these myself).
All three bots agreed that I could write more about student agency and AI. Claude also suggested I write about strategic foresight and AI. NotebookLM suggested I write about AI’s impact on equity and access. ChatGPT suggested I write about multimodality. This is a good example of why I use multiple bots: they are different enough that you can uncover new output with each one.
To keep working on the same topic in all three bots, I decided to focus on student agency. When we got to the research-based prompts, however, my smooth ride with AI came to an end.
When I initially asked for research, all three bots offered suggestions or summaries without references or links to external sources. Only after I explicitly asked for external resources did the bots provide me with titles, dates, and links. A quick Google search revealed that most of them did not exist. ChatGPT was the only one of the three for which the majority of its suggested resources were real; all of NotebookLM’s suggestions were fabricated, and four of Claude’s five were.
Since my three main bots failed me, I used Perplexity’s academic focus filter and the research tool Elicit to locate research articles related to student agency and AI. I downloaded five of the suggested articles as pdfs and added them to the knowledge base in all three bots. All three did a good job incorporating this new information into the knowledge base, explaining clearly (with specific references) how these articles could be used to both support and weaken an argument about student agency and AI.
Composing a New Article
Prompts
Let's say I wanted to craft a paragraph that combined ideas from one or more of my articles, one or more of my slidedecks, and one or more of the research articles. What could that look like?
Could you incorporate direct quotes with attribution into this paragraph?
How would you revise this to reflect Eric Hudson's writing style and approach?
OK, use this paragraph to write an entire article that fits with all of my previous articles. It should not be longer than 1000 words.
I was most skeptical about AI’s ability to ace this stage of the writing process.
All three bots were successful at composing a clear, cogent paragraph that drew ideas from my articles, my slides, and the academic papers I had uploaded to the knowledge base. The writing and argument were typical of generative AI: polished but generic, logical but unimaginative. On first pass, none of the bots included direct quotes or citations from sources. When I asked them to add both, some of the citations were fabricated, some quotes were misattributed (quotes from my materials were credited to the authors of the research papers, and vice versa), and very few of the quotes added meaningfully to the article. I found it impossible to correct these errors via additional prompting.
ChatGPT’s new Canvas feature was the only tool among the three bots that allowed me to line-edit text. When you ask ChatGPT to open the Canvas, your text appears on the right, where you can edit it directly and leave marginal comments, much as you would in a Google or Word document; you can also highlight specific text and ask ChatGPT about it or ask it to rewrite it. On the left, you can talk to ChatGPT about the document. Below is a short video that starts with me uploading more of my decks and asking ChatGPT to use ideas from the decks in the AI-generated article. You can then see me use ChatGPT inside the text to make edits and comments.
This is similar to what you can do with Copilot in Office 365 (and, I’m assuming, what you will eventually be able to do with Gemini in Google Drive).
NotebookLM also had some interesting things to say, but I can’t share them here because NotebookLM doesn’t automatically save chats. If you navigate away from your NotebookLM chat without clicking “Save to note,” your chat vanishes. As a person who went to high school in the 1990s, I found this experience triggered many memories of failing to save my essays in WordPerfect before letting one of my siblings use the family computer.
Generative AI is a Powerful Assistant and a Poor Replacement
When it comes to writing with AI, we are way past the blunt-force prompting, “write me an essay” phase of the technology. The accuracy and nuance of all the major bots have improved, and tools like NotebookLM and ChatGPT’s Canvas make generating, developing, and fine-tuning both ideas and writing easier than ever.
As a former English teacher, I think about the core writing and thinking skills of 1) selecting, integrating, and analyzing textual evidence in a meaningful way, 2) having a good idea (and knowing when it’s good) and composing an interesting expression of that idea, and 3) sustaining an argument over multiple paragraphs in a way that reflects a distinct point of view. I don’t trust generative AI to do any of these things on its own, but if I were in the classroom now, I’d be showing my students this experiment and having an open, honest conversation about what it means to think for ourselves and express ourselves, and what role AI should or shouldn’t play in those processes.
If you’re not aware of NotebookLM’s “audio overview” feature, with the click of a button you can create an AI-generated podcast episode based on content you upload to the knowledge base. Below is the audio overview of the article you just read. You’ll hear the two “hosts” work through this post, quoting directly from the text and commenting on it. It is a powerful example of generative AI’s multimodal capabilities.
Upcoming Ways to Connect with Me
Speaking, Facilitation, and Consultation
If you want to learn more about my work with schools and nonprofits, reach out for a conversation at eric@erichudson.co or take a look at my website. I’d love to hear about what you’re working on.
In-Person Events
November 20: I’ll be co-facilitating “Educational Leadership in the Age of AI” with Christina Lewellen and Shandor Simon at the New England Association of Schools and Colleges (NEASC) annual conference in Boston, MA, USA.
Online Workshops
October 15: In partnership with the California Teachers Development Collaborative (CATDC), I’m facilitating “Making Sense of AI,” an introductory AI workshop for educators.
October 30: I’ll be a part of a free online event hosted by Toddle, “Selecting and Evaluating AI Tools.” I’ll be sharing ideas for how to make mission-aligned, student-centered decisions about whether and how to adopt AI tools for learning.
October 31: Also with CATDC, I’m launching a four-part online series called “Deepening our AI Practice” for educators who already have a working knowledge of generative AI.
Links!
If you’re interested in NotebookLM, I recommend this interview on the “Hard Fork” podcast with Steven Johnson, who helped create the tool.
A very good overview of Common Sense Media’s recent study of teen use of AI. The bottom line: young people are using AI for and at school, and guidance from parents and school has an impact on their behavior. Unfortunately, not all of us are offering that guidance.
“I don’t think I keep hitting these walls with AI because I’m not doing good prompt engineering… I think I keep hitting these walls because of who I am, where I’m from, and how I view the world.” Maha Bali on AI’s “cultural hallucination bias”.
Ethan Mollick with excellent advice on how to take the “crowd” and “lab” approach to building a culture of AI curiosity in your organization.
“The people who teach core content to our nation’s kids and the people who coach teachers on the use of technology are not on the same path. They aren’t even on parallel paths.” Dan Meyer on the AI “disconnect,” an idea that resonates with my own experience visiting schools.
A running list of dozens of articles, academic papers, and other resources on the environmental impact of generative AI.
Yet another study shows that AI detectors are both unreliable and easily fooled.
Some of this reminded me of a young writer I met recently who suggested a method for using AI to improve creative writing that he calls “adversarial creativity.” The basic idea is that a writer generates original work, feeds it to AI, and then asks the AI to generate something similar in tone, perspective, etc. If there is overlap between the two, the writer must rewrite their own work and omit the lame ideas that the AI generated. Each revision of the writer’s work can become more original by removing generic ideas. Much longer explanation here:
https://wgcr.substack.com/p/thesis
A great model-by-model analysis. Things get especially interesting near the end, when students react to AI-generated content.