
How We Used AI to Help Build 1,500 Pieces of Content in 90 Days by Ryan Sargent & Megan Skalbeck of Verblio



We hosted Ryan Sargent (director of content marketing) and Megan Skalbeck (content marketer) of Verblio.

Ryan and Megan shared how they used AI to help build 1,500 pieces of content without sacrificing ethics.

Ryan and Megan shared it all:

  • Tips for building prompts

  • Their tech stack and process

  • Process of editing AI content

Plus, they go deep into AI detection and the tools they use to detect AI content submitted to Verblio.

Watch the full webinar

Verblio has published a lot of content about their AI content journey on their blog.

About Ryan Sargent:

Ryan Sargent is a director of content marketing, trombonist, and aspiring podcast host. He's obsessed with content: everything from academic writing and avant-garde jazz albums to B2B webinars and YouTube influencers.

Follow Ryan on LinkedIn: https://www.linkedin.com/in/ryan-sargent-5a455511a/

About Megan Skalbeck:

Megan Skalbeck is a former math camp nerd, language aficionado, and recovering freelance writer. Six months ago she was plucked from her role with the Verblio content team to do cool things with AI.

Follow Megan on LinkedIn: https://www.linkedin.com/in/megan-skalbeck-75251466/

About Verblio:

Verblio is the world's friendliest content creation platform. They work with thousands of writers to produce tens of thousands of pieces of content for 400+ agencies and 600+ businesses on a monthly basis.

Read the transcript

Ryan:

We're excited to talk about building content with AI and how it works and all of our highs and lows along the way. As Travis mentioned, Verblio builds a lot of content, a hundred thousand plus pieces in a year, and so we are hard at work trying to figure out how on earth that should go in the world with AI, and this is what we've learned so far.

We'll go really quick through the intro slides because Travis hit the high points for us. I am Verblio's director of content marketing. I'm here for your hot takes. I love content. I've done all kinds of content way outside your typical B2B ebook because I'm a recovering jazz musician, so I've mixed and mastered avant-garde jazz albums and submitted papers to peer-reviewed journals and done all kinds of weird stuff like that. But I also really love marketing, so that's how I came to be in this space.

Megan Skalbeck:

I am Megan Skalbeck. As Travis mentioned, I used to be a freelance writer, and I was also a kind of left-brain math nerd programming intern in a past life. And I was doing content marketing. I was a freelance writer for some of our Verblio clients as well as my own private clients for a few years.

Eventually I joined our internal marketing team, and about six months ago, Verblio decided that the entire focus of my job would switch to AI and figuring out how we should be using it within the company. And so that's been a fun ride ever since.

Ryan:

Just so that we're really clear right upfront, the goal here is to talk about building AI content at scale. Building content with AI kind of onesie-twosie is a totally different animal, and there are a lot of workflows that might be a great choice for that, and even tools that are a great choice for building an article or two at a time, that we had to immediately discard because we're talking about building lots and lots and lots of content.

At the same time, the goal here is to show you everything we did and what we've learned from it so that you can do it yourself. Full disclosure, yes, we sell this, but that's not what we're talking about in this webinar. We're here to show you what we did so that you can learn from it.

Along with that, I don't think that we're going to convince you of anything with AI today. If you think that AI content is going to make every website have an infinite number of pieces and articles, I probably won't convince you that that's not a good idea. And if you think all AI content is terrible and will totally trash your rankings, I probably won't convince you that it will work.

If I convince you of anything, I hope it's that figuring out the best use case and the best method for hybrid content was a lot of work and we struggled with it, and we think we've ended up in a pretty good place. So without further ado, you probably have a lot of questions. We're going to answer some of these. Can Google detect AI content? I have an opinion; I don't have hard facts for you on that one. What kind of hybrid content have we been building? We're totally going to talk about that.

The AI state of the union, very timely this time of year. It's not the future, it's already here. You already know this. It is white hot right now. It's everywhere, it's especially all over your LinkedIn feeds. I particularly like this one because here we have someone screenshotting a different person's Twitter thread and putting it on their LinkedIn and getting 1300 engagements.

That's how hot AI is. You can just blatantly plagiarize and get a lot of internet friends. I'm sure everyone here has a ChatGPT account. You can write a haiku. I asked it to make French onion soup. Disagree with that recipe, by the way. You cannot caramelize onions in 30 minutes. And we've all heard the doomsday scenarios.

Fun fact, this is an AI generated image and I Googled it. I Googled AI image apocalypse. So if we don't think Google's going to find a way to monetize that use case, you're in the wrong place. We've all seen these doomsday scenarios. This one is particularly horrifying and also a little amusing to me.

This is not the use case that ChatGPT was designed for. Asking one of the most subtle, nuanced questions in all of ethical theory, of course ChatGPT does not get this right. Also, this is just horrifying. The other doomsday scenario we see a lot comes from creators who have a very vested interest in AI content being terrible, and they have all kinds of examples of AI content being terrible. And it's true.

Freelancers can provide something that right now AI can't, a human touch, the ability to really deeply understand an audience, to tell a clear narrative across long form content, these are things that AI struggles with that freelance writers do a great job with. And so the examples of AI really struggling are out there. They're everywhere if you look, but that's not necessarily the whole story.

There's, of course, also the doomsday scenario of what can happen to your traffic if you publish AI content, and Mark Williams-Cook, who is an excellent, top-notch SEO, has gotten a lot of mileage out of a screenshot recently on LinkedIn because it's terrifying. If you publish a ton of AI content, maybe Google gets you and gets you good.

Then of course there's the PR disaster as well. Your reputation is on the line. And if you publish a bunch of AI content and it's terrible, what happens? Even if Google doesn't penalize you, other humans might just decide that you're not so much fun to work with anymore, and that's a major risk as well.

Luckily though, we live in the real world, not the doomsday world, and it's easy to see why AI content is so tempting. If you want to take the high road and say AI content is never going to be good, you might end up out of business while you wait for one of these doomsday scenarios to befall your competitors, because AI is cheap.

It will produce an infinite number of words for virtually no money. It will do it at any time, it will do it at any place. It will never have thoughts, feelings, ambitions, PTO, death in the family. So it's always there for you and it might do what you want it to do. Megan's going to tell you about some things you also probably already know about AI on the technical side.

Megan Skalbeck:

This is just to make sure we're all on the same page for some stuff we'll talk about later, but as you may know, almost all AI content right now is created with GPT-3. GPT-3.5, actually, is the latest update from OpenAI; it came out at the same time that ChatGPT did. So it's a large language model.

What I love about GPT-3 is all it was trained to do is predict the next word. It wasn't trained to do something specific like previous AI models were, like playing chess or folding proteins or anything like that. It was just trying to predict the next word. And as a result of that really simplistic training objective, it has all of these emergent abilities.

It can write an article, it can brainstorm ideas for tweets, it can classify a statement as positive or negative. It can do all these things that just again emerged out of this being trained to predict the next word. So with that training, it was trained on an enormous dataset of content from the internet, and because of that, it is really good at creating human sounding text.
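To make the next-word idea concrete, here's a toy sketch: a bigram lookup table rather than a 175-billion-parameter network, and the corpus is obviously made up, but the generation loop is the same one-word-at-a-time idea.

```python
import random

# Toy next-word predictor: a bigram table built from a tiny invented corpus.
# GPT-3 learns a far richer model, but generates text the same way:
# pick the next word given what came before, one word at a time.
corpus = ("content marketing is hard . content marketing is measurable . "
          "good marketing is everywhere .").split()

followers = {}
for current, nxt in zip(corpus, corpus[1:]):
    followers.setdefault(current, []).append(nxt)

words = ["content"]
for _ in range(10):
    words.append(random.choice(followers.get(words[-1], ["."])))
print(" ".join(words))
```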

Again, that's what it was trained to do. It was trained to predict the next word in a way that sounds like something it has been trained on, something it has read before. So that's the very, very high level overview of how that works. Because of how it works, it has some serious limitations.

It cannot have original thoughts because by definition, again, trained to predict the next word based on what it's seen before, by definition it's not going to be producing something unique and original. And it can't fact check itself. Hopefully we are all very aware of this by this point, but it can say things that are blatantly false and it will say them with complete confidence.

So it's sometimes hard to detect when it's saying something false and sometimes it might even say according to the Bureau of Labor Statistics and give you some number that is completely false. Relatedly, it won't be citing its sources. If it does happen to say something true, it's not going to be providing you a link to where it got that from.

And its training data cuts off in early 2022, so it doesn't do well with talking about super new events or current happenings, anything like that. So those are, I would say, the biggest limitations, even aside from getting into things like voice. These are the big ones.

Ryan:

And if you've been active on social media, you probably know that that sets up some pretty clear use cases for both human-only content and AI-only content. First, I think it's worth pointing out that the internet was bad at a bunch of stuff in 1993 that it's very good at today.

And I think the same thing's going to happen with AI. There's a bunch of stuff AI is bad at today that one day it will be very good at, and that's worth keeping in mind. The other thing to note here is this list on this slide doesn't represent all content. I would argue it actually doesn't even represent most content on the internet these days.

And what we noticed is it really doesn't represent the content that Verblio tends to build for folks. We work, for example, with local SEO agencies that need lots of content that is fairly generic or say franchised. So that all pretty much lives outside these two boxes, and that means there's an opportunity for hybridization.

You're probably already using AI; you're doing something with it, and whatever you're doing with it, I'm wondering if it's on your website or not. So, first of our pop quizzes: do you use AI today? I hate webinars where we sit here and chat for 40 minutes and you glaze over or start checking your email. So we're very interested in what you all have been doing with AI so far.

I'm seeing some yeses in the chat as well. So I don't think it's a surprise to anyone that so many of the folks in here are actively using it already, that it's truly blown up. So we are right there with you is the thing. And since Verblio produces so much content, we work with a thousand clients, 400 of those are agencies, we produce tens of thousands of pieces of content every month, our reputation is also at stake.

We have to be producing good stuff; it's literally what the business rests on. And we're also competing against free, which is wild. This is not my first rodeo at a startup, and it is definitely my first rodeo where the thing the startup produces all of a sudden has a competitor that is free. And on top of that, everyone's quality standards are going up because we've all seen what AI can do.

And so that is now table stakes. Full disclosure, I think that's a great thing for the world. There is plenty of bad content on the internet, so if everyone's quality standards are going up, that's just fine with me. We have to figure this thing out. And in my opinion, this is the hot take part, this is where the hype train fails. This is where all of your LinkedIn carousels about how 99% of us are doing something wrong don't actually deliver any useful value, because none of those carousels talk about how to build thousands of articles in a way that still produces something that's high quality.

They're about using it to generate a better sales email that I'm still not going to read and I'm still going to discard. So how many of the last 10 pieces that you built delivered on those promises of quality? And so we're in the same spot, we want to have our cake and eat it too.

Thank you Google for ruining every great eating pun in every presentation forever. And we said, what are we going to do? We got to work. We had to figure this thing out. And so we decided to give AI the benefit of the doubt and try to use it to build content and see if it was good enough.

Then we said we will try to edit it. What happens if the AI builds version one and a human comes in and fixes it? That didn't necessarily work either. We'll talk about that experiment in a second. Then we started solving for hybrids, and that's where Megan is going to show off all of our work. We had to identify use cases, we had to figure out how to do it at scale, and then we had to talk about AI detection. And that's how we're going to wrap things up today.

We're going to actually pull some numbers for you and show you what detection tools are saying about our hybrid content. So the first step, are the AI writing tools out there today actually good? And I'm going to turn it over to Megan here.

Megan Skalbeck:

So we did a lot of testing with a lot of different AI tools, more than just the ones on this slide even, and we gave them a very fair shake and were disappointed more by some than by others, but to some degree by all of them. And I'm curious, I noticed a few of you dropping this in the chat now, but we did want to ask to learn specifically what tools y'all are using and which ones you've tried.

And similarly to the last question, actually, it's fascinating to me how much these results would've been different just a few months ago. We'll see here. A few months ago, Jasper was far and away the biggest, and a lot fewer people were using these tools at all. I mean, Ryan, we just did a survey of marketers about AI over the winter, before ChatGPT, and it was under half of people that were currently using it. So it's cool to see how those numbers have changed.

Ryan:

And I guess I want to chime in here, we're talking about AI to generate content, write the words, content optimization tools that are based on AI, things like Clearscope, MarketMuse, stuff like that, I think we've all probably been using those for a while. Those are great tools. That's not necessarily what we're talking about here with this presentation today. We're talking about writing the words.

Megan Skalbeck:

Fascinating. So I'm going to share just a couple examples from our testing with these tools. I wanted to write a blog post on content marketing KPIs. So this was an outline suggestion from Copymatic, which is one of those tools. As you can see, this is just a terrible outline. This would not be a valuable article in any way, and not to call out Copymatic specifically, this is indicative of what you could get from really any of those tools.

And just to reiterate what I said earlier, most of these tools, again, are built on top of GPT-3, so they're essentially re-skins of the same underlying technology. They'll differ in the exact prompts that they use and some of the fine-tuning, but you're not going to get fundamentally different quality from them. So again, this suggestion was not useful in any way, way too generic. And this is one of my favorite bits of AI-generated content, and I'm going to read it aloud because it cracks me up every time.

Again, this was an intro paragraph suggestion for a blog post on content marketing KPIs, and I didn't write it: "It's been said that content marketing KPIs are the new SEO. If you're not a math wizard, then you'll have to translate that into language you understand. If this is what it takes to be successful in content marketing today, then take it from someone who always gets an F in math."

And you read that paragraph and it's just a prime example of the problems of AI content, because all of those phrases on their own sound like something you've read before, they sound like an intro to an article, but taken together, they are completely nonsensical. And that's exactly one of the biggest pitfalls with AI content: if you aren't reading it closely, if you're just skimming things, if you're producing a lot of articles and not taking the time to closely review them, you could think, oh yeah, this is fine. This is very not fine.

Again, nothing bad about Copy.ai in particular. This is something you could get from any of them, but a great example of the dangers there. Oh, okay. One more tool that I want to call out here, because I really appreciate what LongShot AI is doing. LongShot AI is a tool that, like Jasper or Copy.ai, offers text generation, but they were one of the first to also try to integrate some sort of fact checking, which as I mentioned is one of the biggest pitfalls of AI content.

And I have a lot of respect for them for trying to tackle that problem. Unfortunately, it doesn't work at all yet, so that's unfortunate. You can see here I tried it with what I thought would be a very softball fact to check, and I assumed, oh, it'll get this one right, and then I'll try it with a more nuanced claim and it will struggle. Well, it struggled with the softball.

So you can see there, I tried testing the claim "there are 62 states in the United States of America." What it should have done is told me that that claim is false. Instead, it provided me with links to sources to presumably cite that fact from. So unfortunate. I'm still going to keep an eye on them to see how they do. Again, I am glad that there are tools out there that are trying to tackle the fact-checking problem. Unfortunately, it is just a thorny one. So it will take us a while to get there.

Ryan:

So we just figured out that AI content straight out of the box isn't going to be good enough. It's not going to meet our quality standards. What happens if we try to edit that content? Let's say you want to take the plunge; you want to try these AI tools in the most real-world setting possible. So that is the task we assigned Rachel. Rachel's our content marketing manager. She also writes rom-com books for teens and tweens.

Most importantly, Rachel's edited more than six million words in the past two years for local SEO agencies. She knows how to edit an article for an agency. And we said, go hire Jasper to write an article, instead of, say, hiring Verblio or a freelancer, and give it a softball, but a realistic softball.

And so she said, tell me about the best bookstores in Charleston, South Carolina. And Jasper said there is a great bookstore in Charleston, South Carolina.

Then it said, here is a great bookstore in San Francisco. Then it said, here is a best place to buy books. And then it made up five places that don't actually exist. So as we already know, out of the box isn't going to work. So Rachel started editing the same way that she would edit if she got something from a freelancer that wasn't up to snuff.

Unfortunately, the green sentence is the part she kept. The rest of it is her own original writing. That's fine. The thing to note here is that the stakes get really high. If you get a mediocre article back from a freelancer and you don't really put your heart and soul into editing it, maybe you're having a bad day, you're still going to have a mediocre article.

If you take the article about local bookstores that has five made-up places and you don't put your heart and soul into editing it, you end up with something that is really, really wrong. And there's psychological trauma there for the editor, because whether or not the article is even worth reading is totally dependent on your line-by-line fact checking.

So the obvious solution, fire all our freelance writers, hire a bunch of extra editors, and go from there, doesn't take into account the psychological trauma of the editing process. The good news is Rachel's edits were still possible; they only took twice as long. There was still a measurable time cost here. So while the content was essentially free, it wasn't something that, even with editing, was going to be a replicable process.

Remember, we're trying to do this for thousands of pieces a month. So that brings us back to wanting to have the best of both worlds. These were the two best pop culture references we could find for the best of both worlds. I'll let you decide which one's me and which one's Megan. And it took a lot of iteration.

Megan Skalbeck:

All right, so this is, again, as I mentioned at the beginning, this has been my job for the last six months and I've been loving every step of it. V1 for us started just with me in the OpenAI playground, manually playing around with prompts, parameters, figuring out how to produce, and how to get a decent article out of it.

And you can see this was actually back before the latest OpenAI update. It was using the older model there, but that was what this looked like at first, figuring out, okay, one person, back to what Ryan said, onesies, twosies, how do you make this work? Then we just kept iterating on that model because obviously one person doing this in the playground isn't sustainable with the size of our clients and the volume of content that we do at Verblio.

So we've been iterating on that, using OpenAI's API to build out a process that our network of professional writers can use. This is what that currently looks like. So we actually use an Airtable interface, and then there's a whole lot behind it. We've changed the interface for the writers. We've also changed so much on the backend to get better results for different clients, different content types, that sort of thing.

And this is what it looks like currently. This will keep changing. We already had some massive changes to it just in the last couple of weeks and we'll continue to do that. For anyone thinking about experimenting with this on their own, I want to share just some quick learnings that I found to hopefully save you some time.

I mentioned experimenting in the playground first. That is far and away the easiest way to figure out a process, a combination of prompts and parameters that works, before building it out with the API. You'll just save yourself a lot of headache there. The text-davinci-003 model is the latest and greatest from OpenAI. Again, that's based on the GPT-3.5 update.

A couple of things about prompt engineering. Providing a sample of what you're looking for is much more effective than describing what you want. So if you give it a sample article or you even just give it maybe the first couple sentences of an introductory paragraph to get it to emulate your voice, that's going to be a lot more effective than just trying to describe your voice to it.

Remember what it was trained to do: it was trained to predict the next word. Make sure you're giving it enough context to do that accurately. And keep your max tokens low; that is a setting you can play with in the playground as well as in the API that limits the amount of text it will generate at one time.
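For anyone experimenting, here's roughly what those two tips look like in code. This is a minimal sketch using the legacy (pre-1.0) `openai` Python SDK from that era; the prompt text and parameter values are illustrative, not Verblio's actual setup.

```python
import openai

openai.api_key = "YOUR_API_KEY"

# A short sample of the voice you want beats a description of it.
prompt = (
    "Continue this article in the same voice.\n\n"
    "Replacing your windows shouldn't feel like a gamble. A little homework "
    "up front can save you thousands down the road.\n\n"
    "Next, a paragraph about high-pressure sales tactics:\n"
)

response = openai.Completion.create(
    model="text-davinci-003",  # the GPT-3.5-era completion model
    prompt=prompt,
    max_tokens=150,            # keep this low: short bursts stay on the rails
    temperature=0.7,
)
print(response["choices"][0]["text"].strip())
```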

And what we found was that keeping that low kept it from going off the rails, because if you let the AI go for too long on its own, without some human intervention to guide and steer it, it's going to devolve pretty quickly, whether that's talking about nonsensical things or running into more spelling, grammar, and punctuation issues.

Sometimes it just decides punctuation is not a thing after about 600 words, so I would highly recommend that. And also tiny details really matter, especially when you're working with it at scale through the API. The number of times that our CTO and I were on a call trying to figure out what was going wrong and it was, I kid you not, the case of an extra space in the prompt or an issue with a word being capitalized versus not.

It's funny in retrospect, it was so frustrating at the time to figure this out, especially when you're used to working with code that does the same thing every time. When you're working with large language models that are probabilistic, you're not going to get the same results every time. So even trying to troubleshoot things is a whole nother level of frustration.

I also want to call out that getting it to write a certain number of words is something that AI is not particularly good at. And so our solution, what we found really works best for that, is to make your outline a lot more granular and include a lot more sections than you would if you were giving that same outline to a human writer.

Again, this just goes back to really providing guidance for the AI. So instead of just using a single heading and assuming the writer is going to cover the relevant things under it, the way you would with a human, give further bullet points to the AI on what it should talk about in each of those sections to help it stay on track and also produce more words on the topics you actually want it to talk about.
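A sketch of what that more granular outline might look like as a prompt; the heading and bullets here are invented for illustration, and the result would feed into the same completion call shown above.

```python
# One bullet per paragraph you want, instead of a single bare heading.
heading = "Common window replacement scams"
bullets = [
    "high-pressure, sign-today sales tactics",
    "lowball quotes that balloon after the contract is signed",
    "fake short-term warranties that expire before problems show up",
]

prompt = f"Write one short paragraph on each point under '{heading}':\n"
for bullet in bullets:
    prompt += f"- {bullet}\n"
```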

And then finally, OpenAI has seen so much more traffic over the last couple of months.

Since ChatGPT, you are a hundred times more likely to run into outages. So if you are building your business on this and are reliant on OpenAI being dependable and reliable, have some safeguards in place. We've made a lot of changes to our process for that specific problem. You're just much more likely to get an error back than you were six months ago when you're working with it.
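The safeguards can be as simple as retrying with exponential backoff when the API errors out. A minimal sketch, again using the pre-1.0 SDK and its error classes; your production version would want logging and a fallback path too.

```python
import time
import openai

def complete_with_retry(prompt: str, max_retries: int = 5) -> str:
    """Call the completion endpoint, backing off when OpenAI has a bad day."""
    delay = 1.0
    for attempt in range(max_retries):
        try:
            response = openai.Completion.create(
                model="text-davinci-003",
                prompt=prompt,
                max_tokens=150,
            )
            return response["choices"][0]["text"]
        except (openai.error.RateLimitError,
                openai.error.APIError,
                openai.error.ServiceUnavailableError):
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            time.sleep(delay)
            delay *= 2  # exponential backoff between attempts
    return ""
```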

And so just be aware of that and keep that in mind when you're building content with this. And this is what our flow looks like now. So I showed that interface a couple slides ago, but this is kind of how it works conceptually. We get a content brief from one of our clients that goes both to the AI and to our human writer. Our AI will suggest outline points.

The human will actually curate the final outline, also taking into account preferences from the brief, from the customer, anything like that. The AI will write a section, the human will edit that section. The AI will write the next section, the human will edit that. They go back and forth until the article is done and then the human edits and reviews the final content.

Again, going back to the brief, keeping in mind any SEO considerations or things like that. And if you take nothing else away from this presentation, I want to call out that the two most important parts of this are the human curating the outline and then that back and forth. Again, the human needs to be deciding what is going into the article. That is huge; that should not be left up to the robot.

So if you put in a little bit of human time editing the intro paragraph, rewriting it, focusing on diction, voice, and tone, you're going to see the effects of that propagate through the rest of the paragraphs that the AI generates, and it will save you editing time down the line as well as producing a better article overall. So those are, again, the two most important things: curating the outline and that back and forth.
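Conceptually, that back-and-forth loop looks something like this sketch, where `human_edit` is a stand-in for a real writer working in the interface; the outline headings are invented, and none of this is Verblio's actual code. The key detail (confirmed later in the Q&A) is that the full edited draft so far becomes the prompt for the next section, so edits propagate forward.

```python
import openai

def generate_section(draft_so_far: str, heading: str) -> str:
    """Ask the model for the next section, conditioned on the edited draft."""
    prompt = f"{draft_so_far}\n\n{heading}\n"
    response = openai.Completion.create(
        model="text-davinci-003", prompt=prompt, max_tokens=200
    )
    return response["choices"][0]["text"].strip()

def human_edit(section: str) -> str:
    # Placeholder: in practice a trained writer fixes facts, voice, and
    # brief compliance here before the text goes back into the prompt.
    return section

# The outline a human curated from the AI's suggestions plus their own points.
curated_outline = [
    "Introduction",
    "What are window replacement scams?",
    "High-pressure sales tactics",
    "Fake short-term warranties",
]

draft = ""
for heading in curated_outline:
    section = generate_section(draft, heading)
    draft += "\n\n" + human_edit(section)  # edits propagate to later sections
# Final pass: the human reviews the whole article against the brief.
```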

Ryan:

And I can vouch for everything Megan just said because I have volunteered to be one of the humans doing this. Like I said, I'm a big content nerd and I think as a director of content marketing, this is my job to figure this stuff out. I have to know how this stuff works.

And being in the weeds with this was fascinating. It was wild. Just bottom line up front: I found that by the end of this thing, with some practice, I can build one of these about twice as fast as I can build a similar article on my own with no help whatsoever.

So from a results perspective, this thing works. The brief matters, as it always does, whether you're freelancing or using AI. The biggest thing I found I was doing in the back-and-forth section was checking the brief. Did the AI actually address the brief? If it didn't, add a sentence or two of pure human content to make sure the brief was met, and hope that the AI does a better job in the next paragraph. On those outline points: to get 500 words, I'll probably have a dozen outline points.

I'll have the AI generate close to 700 words and then I'll trim. Other folks on our team have taken the opposite approach. They've gone light on the outline, tried to get the AI to build like 350 words, and then added to it. I think it's easier to cut. That might be a personal preference thing with the editing.

That back and forth Megan mentioned is so important because the AI very much takes it into account. If, for example, one of the clients we built this stuff for really wants short sentences and doesn't ever want long sentences in their content, well, the AI has a tendency toward run-on sentences. So by clipping those and breaking them up in the first two paragraphs, by the time the AI is generating paragraphs three, four, and five, it writes in shorter sentences.

It is smart enough to do that, but it wouldn't have done that if I didn't go back and edit it paragraph by paragraph as we were building it. So those editing learnings look a lot like this, and generating multiple outlines is where this starts. Our system has the AI generate two outlines. I've done dozens of these now. I can't think of a single time when I used bullets from only one of the pre-generated outlines.

I also can't think of a time where I didn't come up with bullets on my own. So I'll pick a few bullets from one of the AI suggested outlines, a few bullets from the other and then add a few of my own and then have each of those represent a paragraph that GPT is going to generate for me.

And then, as I mentioned, build more than you need. The other editing piece: I often found myself synthesizing paragraphs. The AI has a tendency to think that a sentence and a half is worthy of its own paragraph, but if I take two or three of those, because I've generated more outline points than I need, and combine them, maybe with a sentence or two of my own in the middle, I'll end up with a paragraph that sounds pretty good.

And then fact checking. This is part of the editing process when you're working with AI-generated content; it has to be. And I Googled things, which is exactly what a freelance writer would do. So no shockers or surprises there. That got us into a place where we needed to scale this thing and figure out how to use it to build a lot of content.

Megan Skalbeck:

Also, what's funny, right, is I realized this morning that this slide will change for us literally week by week as we figure out more use cases with more clients. So the initial projects we were tackling were between 400 and 1,000 words per client, and we were focusing a lot on local and franchised content across a ton of verticals.

So our initial process worked; it was pretty industry agnostic, but it did really focus on shorter-form content and that more local and franchised niche. We are already expanding the use cases from this slide, so that's exciting. We're going to be working on some 1,500-word articles soon. We're going to be expanding into some more online publisher content and figuring that out.

And what's been really cool is like every time I'm on a call with a new prospective client for this, there's usually some new wrinkle that we need to figure out whether it's in how they want their content formatted or what their industry is or something like that. And it's super fun for me to figure out, okay, how do we need to tweak this pretty general purpose process to meet this specific need?

So continuing to iterate on that and then often learning things that we can apply across the other verticals and/or with our human only content as well.

Ryan:

Yeah, that's definitely worth a second call out. This is all happening alongside the human content that we're continuing to build. So we already know how to build human-only content at this scale, and that process, that whole product line, is continuing.

The other thing I'd call out here is, if you have one takeaway from our hybrid process, it's this: we didn't take AI and use it to help make human-only content faster, and we didn't take AI content and try to fix it at the end. This is totally intertwined, and through all of our experimentation, that is the thing we just keep coming back to over and over and over again.

You get the most bang for your buck when you go back and forth. So if there's one piece of the AI assisted world that you take away, I'd say it should be that.

Travis:

And a quick interjection, Ryan and Megan: we have a question that keeps popping up, asking for some clarification around what you mean by editing paragraph by paragraph.

Megan Skalbeck:

So, what I alluded to earlier when I mentioned only having the AI generate a bit at a time and not letting it write a full article at once: how we build our articles with this process is we essentially feed the AI one outline heading at a time. And so what it does is it will write the intro and it will return that to the human writer.

The human writer will edit that intro and then add the next heading and the AI will then write that next section, return it to the human, the human will edit that next section and then continue on through the article. So again, as Ryan mentioned, if you make edits for shorter sentences or something like that, or again to change the voice early on, the AI will then have those changes to use as it's writing those later paragraphs.

But that's how we're building this: basically one outline item at a time. Does that clarify it? Hopefully.

Ryan:

I can give you an example. So, "Buyers Beware: How to Avoid Window Replacement Scams." I was the editor on that piece. So I had it write an introduction about window replacement scams. Then I edited it, because the intro was pretty bland. Then I had it write a paragraph about what window replacement scams are.

Like many millennials, I continue to rent a townhouse instead of owning a home, so I don't know what a window replacement scam is. But like a freelance writer, I needed to find out. So I had it write that paragraph, I went and checked the facts, then I went and fixed that paragraph. Then I had it write about tactics in window replacement scams.

So I had it write a paragraph about high-pressure sales tactics. Then I edited that. Then I had it write a paragraph about fake short-term warranties, which are another form of window replacement scam. So literally back and forth, one paragraph at a time, editing at each step.

But when you're only editing a hundred words at a time as an experienced writer, that was pretty quick. I didn't agonize over those hundred words for 10 minutes. I spent 90 seconds and was like, great. Fix that. Move that, done, generate the next thing.

Megan Skalbeck:

And to Matt's question, just to clarify, yes. So we're sending back the full edited article as the prompt for the AI to generate the next piece.

Ryan:

As you can see-

Travis:

Thanks for that.

Ryan:

As you can see from these other titles, this is pretty standard content. These look like articles that appear on the internet. They don't look like thought leadership, and they're not something a content mill is going to spit out with any kind of quality.

So producing these kinds of things at quality is our bread and butter, and it's something that has really worked well with this hybrid approach. So before we run out of time, results. This is where I wish I had 40 slides of beautiful Google Analytics screenshots to show you.

We don't. In part that's because the content's very new; that piece on the last slide about window replacement scams I wrote two weeks ago, so it hasn't had a chance to rank yet. The other piece of that is we don't have access to our clients' clients' Search Console and analytics data. That's just not something that we have access to.

It's something that we're working into some agreements. So my hope is that one day we get to share those graphs with you, but not something we have today. Megan, what else is there to say on that?

Megan Skalbeck:

I mean, we don't have the results for our human-only content either. Again, for the reasons you said: we work with agencies, and we don't have access to their clients' data, unfortunately.

Ryan:

We did learn a lot though, and some of what we learned had to do with the team. And so Megan, I'll let you cover this stuff.

Megan Skalbeck:

One of our first beta clients actually worked with us on this project because we provided them samples through this process, and they were better than what they were currently getting from their human-only content vendors.

So that was the first major qualitative win for this: okay, we are producing something better than what people might be getting from other vendors or freelancers. At this point, we have a few dozen of our writers and editors trained on this process, and that is significant because this is different from a normal writing process.

Freelance writers already recognize the difference between being a writer and being an editor. And this requires some very specific editing, with attention to a few really important things like fact checking, ensuring that the article actually makes sense logically, things like that.

So at this point, we've got quite a few trained on that, and we're adding to that pool every week as we scale this up. On the efficiency side, as Ryan mentioned earlier, the average article gets done about twice as fast through this process. The other exciting piece for us, and for anyone who's ever worked with freelancers, is we've had every single piece completed on time, and again, we're doing hundreds of pieces every week just for these few clients right now.

So that's a big win for us. And that last bullet point, I want to explain what we mean there. Our writers have reported that instead of spending the majority of their time simply getting words on a page, they're actually spending more time on really improving the quality of that piece.

Because they're not just spending their time on some of those more rote pieces, again, getting the basics, the bare bones of an article, they're actually able to put more time into doing things that only a human can do. Maybe that's really making sure the E-E-A-T is present and accounted for, or making sure the voice is what the client wants. They're able to put more time into that because they're spending less time on the rote pieces of the work.

Ryan:

In my case, that's deleting Oxford commas, because we have a client that doesn't want them and I love the Oxford comma. So what did we get out of this experiment of producing thousands of blogs? Everything we do for the human-only content, we also do for this stuff: the fact checking, the infusing voice, the paying attention to briefs and style guides, even when they don't include Oxford commas. The speedy delivery piece, this is all Megan, but this required a lot of custom workflows.

Megan Skalbeck:

And our advantage here was that we had already built out a lot of these for our human content. Every client has different preferences on how they want their content delivered, and we work with different writer pools for all of these different clients. So figuring that out on the backend, automating that as much as possible, being able to send content directly to our clients' CMS, and also having our team that's used to working with content at this volume.

So managing writers, managing deadlines, managing the briefs, all of that sort of thing was a really important part when you're doing it at this level of scale. And then that final bullet point, as I already mentioned: training up our writers to specifically be working with the AI and recognizing the ways in which they need to mitigate the AI's weaknesses and the ways in which they can add the most value to a piece of content. So that's been a fun, nuanced thing to explore.

Ryan:

It's amazing what a hybrid job that is. It's very meta, a hybrid job for hybrid content, but it is both writing and editing. Let's talk about detecting AI content and AI detection tools. Here, we do have some cool graphs to show you. Megan's going to talk about our tech stack, and then I'll talk about the graphs.

Megan Skalbeck:

So we are currently using Originality.AI, and we actually are running this on every single submission that goes through our platform, both from our hybrid content and from our human only content. We've tried a lot of different tools. Originality.AI is the one that we found that works the best.

Most of these AI detection tools will return a percentage value, which, just to be clear, is not the percentage of the article's content that they think was generated by AI; it is the percentage likelihood of the article being produced by AI. So if you see 50%, it doesn't mean half of it was generated by AI. No, no, that's not what that means.

OpenAI does have a new tool. They don't have an API for it yet, which is why we're not able to use that one at scale; again, we're doing this at massive volume. But Originality.AI is the best one we've found. And the important thing here is that none of these tools are perfect by any means. And so when you're running them on just one or two pieces, the results are not going to be that significant.

Fortunately, because of our marketplace, we can run thousands of tests. We can also run hundreds of tests per writer and see trends across that, right? Because you could have a whole human-only piece score very highly through one of these tools. You could also have a pure AI piece score low. So being able to do it at volume and identifying trends that way is how we're able to actually make those results valuable, and also, again, running them across clients too.

Ryan:

And to some extent, it's subjective. If you get a 50% score, it means that this tool thinks it's a coin flip. That does lead to some false positives along the way, just as with plagiarism checkers. So, fun graphs. Content that is a hundred percent produced by AI tends to score 90-plus.

So we've set 80% as our threshold. And Megan, you can talk about what happens when we put the HCAI content through this checking process.
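Since any single score can be a false positive, the useful signal is the trend per writer across many pieces. Here's a sketch of that aggregation, with `ai_likelihood()` as a placeholder for whatever detection API you use; this is not Originality.AI's actual endpoint or response format.

```python
from collections import defaultdict
from statistics import median

THRESHOLD = 0.80  # pure-AI content tends to score 0.90+

def ai_likelihood(text: str) -> float:
    """Placeholder for a call to your AI-detection tool of choice."""
    raise NotImplementedError

def writers_to_review(submissions):
    """submissions: iterable of (writer_id, text) pairs."""
    scores = defaultdict(list)
    for writer_id, text in submissions:
        scores[writer_id].append(ai_likelihood(text))
    # A single high score can be a false positive; a high median across
    # many pieces is a pattern worth a manual review.
    return {w: median(s) for w, s in scores.items() if median(s) > THRESHOLD}
```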

Megan Skalbeck:

Yeah, absolutely. So, our initial HCAI content: I mentioned our process for this has changed; we used to not have quite so much back and forth between the human and the AI. And initially, just through our prompt engineering and how we set up the model, we were able to get around a 74% likelihood of AI with the first draft, before the human reviewed it.

After the human reviewed it, and with our current process now where there is that back and forth, so there's more human input being added throughout the entire process, we are averaging around 34%. And again, these are averages; on an individual piece they can vary, as we mentioned with false positives. But we're pretty stoked about that, and we're working to see how we can continue bringing that down.

Ryan:

I want to be really clear about these next two graphs. So we took all of the content that was submitted to Verblio's platform in the first week of January for each of these years, so one-fiftieth of our total content, still thousands of pieces, and asked Originality.AI to show us the median AI score for each year.

So again, this is pretty low. In 2019, not many people were using AI to build content. And again, a median score of 4% means, yes, half the scores were above 4%, but a 10% likelihood that it was generated by AI almost certainly means the piece was not generated by AI. So what this graph is saying is that of the content we received in the first week of January 2019, virtually none of it, likely none of it, was AI.

Enter ChatGPT in November 2022, and you can see that there are a lot of writers, unscrupulous folks, who are trying to submit AI-generated content to our marketplace. Those people are all banned because we caught them and we immediately disabled their accounts. So this is the kind of work that AI detection is doing in our workflow.

And again, even on this 2023 data point, it's almost all tied to a very small group of writers who are submitting only AI content with no editing. Again, we banned them. This is showing the same thing. So this is the number of posts greater than 80% detection score. You can see that there are false positives.

I don't think more people are using AI to generate content and try to slip it past us in 2020 than in 2021. That wouldn't make sense. And these numbers are still quite low overall, and this is why we run through the checker. And I think this is more and more important for people who are purchasing content no matter where they're purchasing it from.

There is a real likelihood, or maybe not likelihood, but possibility that what you're getting might be generated by AI now, given the ubiquity of these tools. Our last slide before we sum things up and answer some questions, we did want to try OpenAI's new tool. OpenAI's leading the way on some of this stuff.

And while it doesn't have an API, so we can't run thousands of pieces, we ran three dozen: a dozen pieces that Originality told us were likely AI, a dozen it wasn't sure about, and a dozen it said were probably not. We put them through OpenAI's classifier. There was only one instance of Originality saying probably AI and OpenAI saying probably not.

There were a bunch of instances of OpenAI saying maybe this is AI-generated content and Originality saying, oh no, this is probably AI content. So again, this is only 30-something pieces, but based on this small sample size, we think Originality might be checking a little harder, erring on the side of being harsher. I'm already into some hot takes where Megan is like, I don't know if you can say that.

Megan Skalbeck:

I would say what I appreciate about OpenAI is they are very clear about their tool and its limitations, and I think they are being overly cautious on that front, which is also, I think, probably why they haven't released an API for it yet. They've said this is meant to spark conversation around the provenance of content and all of that, which is interesting.

But yeah, I do appreciate the rigor of Originality.AI on these, just because it then gives us the tools to do those manual reviews, because that is what it takes in all these cases as well: taking the score, but then having a human review it, so they can also add in their judgment on where that content is likely coming from.

Ryan:

So, the obligatory summary slide. AI-only content has costs. It looks very, very cheap, but everything from traumatizing your editors to extra editing time to downtime with the OpenAI system means it's not as cheap as it looks. And hybrid content can be the best of both worlds, but you've got to put humans in throughout the process.

It has to be a true back and forth, a true marriage. You can't just edit the thing at the end and expect to get what you're after. So far, we've found that local, franchised content under a thousand words is the best fit, but that almost any industry can find success with hybrid content.

As Megan said, we're testing new use cases all the time. Legal stuff is one of the things next on our list. So the hope is that this list gets a lot longer, pretty quick. And you've got to have custom workflows to make this thing work at scale. That's the 10 second summary. Megan, what did I leave out before we take questions?

Megan Skalbeck:

I think you nailed it. We're good for questions.

Ryan:

Awesome.

Travis:

Awesome guys. So the first question is how do you handle the need for subject matter experts to know what you're talking about in the outline and how you talk about it?

Megan Skalbeck:

So, this is something we had already solved with our human-only content: we have a very large marketplace of writers with industry expertise across dozens of different industries. And so we're able to pair our clients with the writers that are familiar with their subject matter.

Travis:

Awesome. Cool. And then we've got another question. Carrie asked: what online source becomes the source of truth for fact-checking citations if so much content becomes AI generated?

Ryan:

That's a great question. I think that in the long term, that's going to become a bigger problem. For now, when I'm editing this stuff, I'm looking at the same sources I would look at if I wanted this information myself. Actually, in one of the pieces I edited, I added a link to the Mayo Clinic. I was like, that is a trusted source and relevant.

Travis:

Cool. And then another question from Jeanette: how do you get around the fact that AI can't cite sources? I'm assuming you just add them at the end after the content's already been completed.

Megan Skalbeck:

Yep. That's in the human step, because again, a human needs to be checking any claims that the AI is making anyway. And so as they're checking those claims, they're able to add in sources for referencing those claims. But again, that is definitely a purely human piece at this point. We would love it if we could figure out some way to automate some of that work, but at this point it's on the human checklist.

Travis:

Awesome. And one of the questions just came in about the tweet that showed the screenshot of the Google Search Console dashboard where they just had a significant drop in traffic. They're kind of asking, what do you think happened?

Ryan:

So my understanding with that screenshot is that Mark Williams-Cook was working with a client who published 10,000 pieces of AI content, and within a couple of weeks everything crashed to zero traffic. Google clearly penalized that website. I would honestly recommend connecting with him on LinkedIn and asking more about that specific use case, because he's great at this stuff.

And I know he's also cited examples where he's published large numbers of pages in the past and not been penalized, which is what led him to believe that the AI content was the culprit there. If he as an SEO did the same process and the only variable was who wrote the content, then that makes sense.

Travis:

Awesome. And this is actually the last question, another question from Jeanette. She asked what can you tell about the content that is being detected as potential AI? I think she's kind of talking about the writers that submitted content to Verblio and you removed them from the platform, but was there something that could have stood out or was it just you weren't asking for AI content and they submitted AI content?

Megan Skalbeck:

For us, it's important that if our customers are paying for human-only content, that's what they're getting. The hybrid solution we offer is cheaper than our human-only content because it doesn't take as much time to produce. So for us, it's honestly a matter of reputation, brand risk, and quality: ensuring that, again, our customers can trust us to provide human-only content.

And we are very clear with our writers as well that that's what we expect, simply because in most cases, if they're submitting AI content, it isn't meeting our quality standards. It's not necessarily about the AI per se, but we've seen that when they're generating AI content on their own through Jasper or another tool, and not going through the process that we've built out and identified as actually creating quality content, often the quality just isn't there, regardless of how it was produced.

Travis:

Cool, helpful. And then this actually might be the last question. Vera sent in, what's a typical turnaround time for this type of content? I know, Ryan, you said it cuts your kind of delivery time in half, but as far as how many hours specifically do you kind of see saving from using AI content?

Ryan:

I saw this question in the chat too. So a 500-word post about window replacement scams was 45 minutes to an hour for me before, and now it's 25 to 30 minutes, which is pretty quick. It helps that the brief is systematized. Megan, can you say a couple of things about how we're structuring the brief for the AI?

Megan Skalbeck:

Yes. So, currently we're getting our briefs from our clients, and those vary, right? We haven't standardized those; we haven't forced our clients' briefs into any set format. What we're providing the AI is a few key pieces of information about the business, plus any guidance they have on the outline, but a lot of the details in the brief will be up to the human writer to incorporate.

So if there are SEO preferences, if there are other formatting preferences, or things like, we've had clients who say, okay, I want an article on this, but don't mention this, maybe in the case of a roofing contractor, they don't want to mention a certain type of roof that they don't provide, you can't reliably tell the AI not to talk about something specific like that.

So that's where the human comes in. So a lot of the brief is up to the human. There are some key, standard pieces of information we pull out of it: business name, business type, geographic location, things like that. For the most part, we just take what the client gives us and give it to the writer.
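As an illustration of that split (field names and values invented for this sketch), only a few standard pieces of the brief reach the AI's prompt; constraints like "don't mention X" stay on the human editor's checklist.

```python
# Invented example of the structured bits of a client brief.
brief = {
    "business_name": "Acme Windows",
    "business_type": "window replacement contractor",
    "location": "Charleston, SC",
    # Enforced by the human editor, not the prompt:
    "notes_for_writer": "Don't mention vinyl windows; the client doesn't sell them.",
}

ai_context = (
    f"You are writing for {brief['business_name']}, "
    f"a {brief['business_type']} in {brief['location']}.\n"
)
```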

Ryan:

And so curating that outline is part of my 30 minutes. That is the thing I've gotten fastest at. The editing is the editing, but I'm much faster at picking and choosing outline points than I was on the first article I did.

Travis:

Wow. That's awesome. Well, thanks everybody for taking time out of your day to join us, and definitely big thanks to Ryan and Megan for delivering such an insightful webinar. Please give them a shout on LinkedIn. I'll drop their profiles in chat. But Ryan, Megan, do you have anything else to add before we give everyone their time back?

Ryan:

Just that this was a blast. Thanks, everybody, for showing up and listening, and by all means, check out our other AI-related experiments and connect on LinkedIn. We'd love to chat more about it.


Written by
Bernard Huang
Co-founder of Clearscope