So one thing that a lot of coders do is write a blog. While I’m not normally a trend chaser, I do see the benefit of doing something like a blog to show your inner thoughts on coding and the like.
Thus I decided to try my hand at blog writing. But as I wanted to flex my coding skills I also decided to try my hand at coding a blog, and this gave me an article to write!
The blogosphere, like many internet phenomena, has exploded in popularity over the last decade, and along with that have come in-depth and sophisticated tool sets designed to simplify the process for end users, so that they don’t need specific programming skills to produce blogs.
Software solutions like WordPress, Blogger, Tumblr and Medium provide easy platforms for producing a blog, but as I am a software developer I decided to use a different base for my blog site: one that had API tools associated with it but no defined blogging system, allowing me to develop a front end and CMS from the ground up, both to suit my needs and to satisfy my curiosity about how to build a project like this.
So I decided to make my blog using Google Drive and Docs.
Google’s suite is easy to use and powerful

First and foremost: why not use an existing blog or content management service? While a CMS like Strapi, WordPress or Cockpit could have streamlined my design process, I had two reasons for using Google Drive as my database.
1. Google Docs is a powerful word processing package that I am highly familiar with, and it comes with a robust set of API tools in the form of the Google APIs.
2. Looked at from a technical standpoint, a Google Doc is a JavaScript-derived object database, which essentially provides a ready-made data structure framework to base API calls on. Learning how to develop a CMS from this represented, to me, a great learning experience.
As I mentioned in a previous blog article, Google has a number of ways of accessing data from Drive directories. For this project I drilled down to two potential options for developing the blog:
1. Use Google Drive as a database to store Google Docs files, use the googleapis npm package to pull data from Google Drive, and use a Next.js framework to generate server-side static web pages based on the data passed, essentially creating a Jamstack.
2. Use Google Apps Script as a RESTful back end, turning the data from the Google Drive into JSON format and passing it via a web fetch request, then use the Next.js framework as a lightweight React router to generate pages on load.
Ultimately, as I was starting out with both the Google APIs and Next.js, and as this represented my first major project, I opted for the second option. My reasoning was that Apps Script provides essentially a Node.js-style server with a solid built-in testing environment, and I felt I was going to be doing a lot of testing while I got to grips with the system. Similarly, this would be my first time using Next.js v13 with the app directory, and having a good foundation in React I felt that a light-touch approach would help me get the project done in a reasonable timeframe.
Proposed data structure, where blog articles are produced via Google Docs and stored on Google Drive. Google Apps Script then turns the Docs file into a JSON format, which is passed to a Next.js server that processes the data into HTML content.
Google Drive and Docs provide a simple database for storing articles; Google Apps Script provides a platform for generating RESTful APIs
The first step was to create a folder on My Drive called blog. This folder has two subfolders, drafts and articles. The drafts folder is for blog articles that are under development and won’t show on the website, while the articles folder is where finished blogs are stored.
Google Drive is used as the database, with the blogs themselves stored as Google Docs files.
After establishing the database location, the next stage of the project was to work out what data is needed for the website and when.
The proposed page structure for the blog was:
A landing page, which provided a list of the blogs
Individual blog pages which contain the content of the blog article in its entirety.
As such I determined that two specific API calls are needed: one to list the published blogs for the landing page, and one to return the full content of an individual blog.
So now that the blog folder has been set up on Google Drive and the API calls have been defined, it’s time to build the first API call.
Methodology for fetching
In a previous blog I provided examples of calling Google Apps Script functions via a web link, and I will use that method for this project. A separate Google Cloud project was set up and the Docs and Drive APIs were enabled, then a standalone Google Apps Script project was generated and associated with it. We also need the advanced service for Docs, but I will get to that later.
Because this Apps Script program will need to access more than a single document, a general, standalone project is required.
Data to be fetched
Now that the project has been established, the first thing to do is get a list of blog articles. For this we need:
The folder id
A list of files in the folder
The folder id is found in the web address; for my project the web address for the folder is:
https://drive.google.com/drive/folders/1fs11nkZ5cZNWQ73YC-Zttk2OTvC4k0Vn
So the id is everything after the last ‘/’, i.e. ‘1fs11nkZ5cZNWQ73YC-Zttk2OTvC4k0Vn’. This folder is private and can’t be viewed by anyone but myself, but as Apps Script runs with my privileges it doesn’t need to be shared with anyone else.
Once the folder id has been established, getting a list of files is a matter of opening the folder with DriveApp and getting the files it contains. The following code achieves this:
function doGet() {
  // The id of the blog articles folder on Google Drive
  const id = '1fs11nkZ5cZNWQ73YC-Zttk2OTvC4k0Vn';
  const folder = DriveApp.getFolderById(id);
  const files = folder.getFiles();
}
This produces a FileIterator, which may not initially look that useful. Google uses a FileIterator because a getFiles() call could potentially pull thousands of files, which would be a huge response and a waste of processing power. Instead the FileIterator provides three methods: getContinuationToken(), hasNext(), which returns true if there are more files to iterate over and false if not, and next(), which returns the next file in the iterator.
The easiest way to process the file list is to use the hasNext() boolean in a while loop:
while (files.hasNext()) {
  const file = files.next();
}
While this loop runs, logging the file variable will show the File class with the name of each blog article. As you can see from the Apps Script reference, various properties of the file can be read; here we’ll use getId(), so that we can use the id to call individual blog articles when required. Additionally we can call getLastUpdated() to give us a timestamp to use for the blog.
This leaves us with the following code:
while (files.hasNext()) {
  const file = files.next();
  const fileDate = file.getLastUpdated();
  const fileId = file.getId();
}
With this we can create a list of blog articles with their names and the date each was last updated. This is good, but I want the landing page to show a bit more: specifically I’d like an image associated with the blog, and some of the text, say a portion of the first paragraph. So the second part of the blog listing function is a sub function that finds the first picture in the blog and the first paragraph.
getDataFromDoc()
The function getDataFromDoc() takes the fileId and fileDate variables. While fileDate isn’t used for any data fetching, it is required for the listing and forms part of the final return.
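The article doesn’t reproduce getDataFromDoc() itself, so here is a minimal sketch of what it might look like, based on the sub functions described below and the JSON output shown later. It assumes the advanced Docs service is enabled (covered further down), and the handling of the ‘No image’ case is my assumption rather than the original code.

function getDataFromDoc(fileId, fileDate) {
  // Fetch the raw document structure via the advanced Docs service
  const gdoc = Docs.Documents.get(fileId);
  const body = gdoc.body;

  // Find the first inline image and pull its contentUri and description
  const imageKey = getFirstImageKey(body);
  let image = 'No image';
  if (imageKey !== 'No image') {
    const embedded = gdoc.inlineObjects[imageKey].inlineObjectProperties.embeddedObject;
    image = {
      imageDescription: embedded.description,
      imageCuri: embedded.imageProperties.contentUri
    };
  }

  // Assemble the listing entry for this article
  return {
    title: gdoc.title,
    image: image,
    introText: getIntroText(body),
    fileId: fileId,
    fileDate: fileDate
  };
}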
getFirstImageKey()
getDataFromDoc() is split into further sub functions. The first, getFirstImageKey(), scans the doc for the first image displayed and returns its inlineObjectId, which can be used to pull further data about the image: specifically the contentUri, an image URI with a lifetime of around 30 minutes that is used to display the images on the blog website, and the description, which provides the alt text if available.
function getFirstImageKey(body) {
  const listOfContent = body.content;
  for (let i = 0; i < listOfContent.length; i++) {
    if (listOfContent[i].paragraph) {
      if (listOfContent[i].paragraph.elements) {
        if (listOfContent[i].paragraph.elements[0].inlineObjectElement) {
          return listOfContent[i].paragraph.elements[0].inlineObjectElement.inlineObjectId;
        }
      }
    }
  }
  return 'No image';
}
This function iterates through the body.content of the doc, looking for paragraph elements that contain inlineObjectElements. Because body.content is an ordered array, the first image found is the first image in the document.
The ‘if’ chaining, while cumbersome, prevents runtime errors: if you access listOfContent[i].paragraph.elements[0].inlineObjectElement directly and paragraph doesn’t exist, the resulting error will break the function.
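On the newer V8 runtime, optional chaining can express the same guard more compactly. This is just an alternative sketch, not what the project actually uses:

function getFirstImageKey(body) {
  for (const item of body.content) {
    // Optional chaining returns undefined instead of throwing
    // when an intermediate property is missing
    const objectId = item.paragraph?.elements?.[0]?.inlineObjectElement?.inlineObjectId;
    if (objectId) {
      return objectId;
    }
  }
  return 'No image';
}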
getIntroText()
The getIntroText() function serves a similar purpose to getFirstImageKey(), except that it takes the first NORMAL_TEXT styled paragraph in the document that contains text, and returns the textRun content. Again, if chaining is a simple way to prevent the script from breaking.
function getIntroText(body) {
  const listOfContent = body.content;
  for (let i = 0; i < listOfContent.length; i++) {
    if (listOfContent[i].paragraph) {
      if (listOfContent[i].paragraph.paragraphStyle) {
        if (listOfContent[i].paragraph.paragraphStyle.namedStyleType === "NORMAL_TEXT") {
          if (listOfContent[i].paragraph.elements[0].textRun) {
            // A content length of 1 means the paragraph only holds a newline character
            if (listOfContent[i].paragraph.elements[0].textRun.content.length !== 1) {
              return listOfContent[i].paragraph.elements[0].textRun.content;
            }
          }
        }
      }
    }
  }
  return 'No text';
}
Adding these to the doGet() function gives us:
function doGet() {
  const id = '1fs11nkZ5cZNWQ73YC-Zttk2OTvC4k0Vn';
  const folder = DriveApp.getFolderById(id);
  const files = folder.getFiles();
  const displayInfo = [];
  // Build a listing entry for each published article
  while (files.hasNext()) {
    const file = files.next();
    const fileDate = file.getLastUpdated();
    const fileId = file.getId();
    const fileData = getDataFromDoc(fileId, fileDate);
    displayInfo.push(fileData);
  }
  const returnValue = JSON.stringify(displayInfo);
  return ContentService.createTextOutput(returnValue).setMimeType(ContentService.MimeType.JSON);
}
As the Drive directory is static, this fetch request doesn’t need any input value, and the return value is an array of objects, each with the following structure:
{
  "title": "Making This Blog",
  "image": {
    "imageDescription": "Googles suite is easy to use and powerful",
    "imageCuri": "https://lh3.googleusercontent.com/c81hOQHheIyFMA4Fp_j68ZMbWS7RQWhu0ljnhirBneNNiOMuV1BIz86zftTtuNGyNKbC6IXBEI2iA6rZrwddCNp0pLMTJNDfXxGkDVzx-knFlWZ7PXRL_4QawazFHBcD9dp_Z74B5MKBROABx-f2-dtLWDqUZ11h"
  },
  "introText": "So one thing that a lot of coders do is write a blog. While I’m not normally a trend chaser, I do see the benefit of doing something like a blog to show your inner thoughts on coding and the like.\n",
  "fileId": "1VBdL5h6a8hiY7C3S3xBNrvElvBCpGQEwZQ8kEZusGhY",
  "fileDate": "2023-07-21T18:51:45.348Z"
}
This gives me a JSON package I can work with to populate the front page of the blog.
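For reference, once the script is deployed as a web app that can be fetched anonymously, the front end can consume this with a plain fetch. The deployment URL below is a placeholder, not the project’s real endpoint:

// Hypothetical deployment URL; the real one comes from deploying the Apps Script project as a web app
const LIST_URL = 'https://script.google.com/macros/s/DEPLOYMENT_ID/exec';

async function getBlogList() {
  const response = await fetch(LIST_URL);
  // Each entry has title, image, introText, fileId and fileDate
  return response.json();
}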
Google Advanced Services
In my previous blog article I briefly mentioned Google Advanced Services as a way to access more functionality from the Google APIs than Google Apps Script normally has available.
Because I chose to pull images directly from the Docs files, this approach comes with several limitations. The major limitation is that the image URI links are temporary, and they can only be created via the full API system, not via standard Google Apps Script.
I did investigate alternative methods, and if I were to design the blog again I would probably opt to develop a script that transferred the images to a separate folder for linking. However, the developed solution did fulfil my requirements.
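For illustration, a rough sketch of that alternative might look something like the following. This isn’t code the project uses, and the folder id is a placeholder:

// Hypothetical helper: copy each inline image out of a doc into a Drive folder,
// so each image gets a permanent file rather than a temporary contentUri
function exportImagesToFolder(docId, imageFolderId) {
  const doc = DocumentApp.openById(docId);
  const folder = DriveApp.getFolderById(imageFolderId);
  const images = doc.getBody().getImages();
  const fileIds = [];
  for (let i = 0; i < images.length; i++) {
    const blob = images[i].getBlob().setName(doc.getName() + '-image-' + i);
    const file = folder.createFile(blob);
    fileIds.push(file.getId());
  }
  return fileIds;
}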
The second required call is for a specific blog, where all the data in the blog is turned into a JSON payload to be passed via the REST protocol.
Google Docs files are essentially JavaScript objects with a distinct structure that can be utilised to generate a JSON payload to pass on.
Methodology for fetching
A new Google Apps Script project was developed for this call. Its setup used the same Google Cloud project as before, with the advanced Docs service enabled.
Data to be fetched
The first part of the project was to establish the format of the JSON file which will transfer data from the blog to the front end. As Google Docs is being used as the word processor for blog writing, thanks to its ability to insert images and links inline, it provides a repository in which it is easy to create blog articles.
Breaking it down into components, the data that needs to be transferred is:
1. Text - with styling, including bold and italics as well as hyperlinks
2. Images
3. Code blocks - this is a coding blog after all!
Aside: why not Markdown?
While researching this project I found that several blog sites use Markdown as the method for writing and transferring text. Markdown has all of the features required for a blog, including links, images, code blocks and the like. It is relatively easy to work with, and it is understandable why so many blog platforms use it as their primary method for creating posts.
There are two reasons why I opted not to use Markdown as my word processing tool. The first is that I wanted to try something that I hadn’t seen done, but knew was possible. The second is that I wanted a system that a non-coding person could use to produce blog-like articles, and I believed it would be easier for a non-coder to use a word processor like Google Docs than it would be to learn Markdown.
A minor tertiary advantage of using Google Docs is Google’s voice typing capability, which allows the user to dictate blogs, at least for a first draft, as I find it easier to dictate than to type.
doGet()
The first thing to do is to structure the basic doGet() fetch command. Here the plan is for a generic fetch that selects the blog by id, so the fetch request has to include the id. Using the id rather than the name is better, because the id is unique while two blogs could share the same name.
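In practice that means the deployed web app is called with the id as a query parameter, roughly as sketched below; the deployment id is a placeholder:

// A request such as
//   https://script.google.com/macros/s/DEPLOYMENT_ID/exec?id=1VBdL5h6a8hiY7C3S3xBNrvElvBCpGQEwZQ8kEZusGhY
// arrives inside doGet(e) with the query string exposed on the event object:
//   e = { parameter: { id: '1VBdL5h6a8hiY7C3S3xBNrvElvBCpGQEwZQ8kEZusGhY' } }

That fileId is the same one shown in the listing JSON earlier.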
My doGet(e) code is structured as:
function doGet(e) {
  // Not used directly, but calling DriveApp here ensures the script is granted
  // Drive permissions, which affects the image URIs generated later
  const folder = DriveApp.getFolderById('1fs11nkZ5cZNWQ73YC-Zttk2OTvC4k0Vn');
  if (e) {
    if (e.parameter) {
      if (e.parameter.id) {
        const id = e.parameter.id;
        const data = getData(id);
        const returnValue = JSON.stringify(data);
        return ContentService.createTextOutput(returnValue).setMimeType(ContentService.MimeType.JSON);
      }
    }
  }
  // Fallback: no id supplied, return a 'blog not found' payload
  const noBlog = [
    {
      style: "HEADING_1",
      className: ["blog-text"],
      content: ["Error, blog not found!"]
    }
  ];
  const returnValue = JSON.stringify(noBlog);
  return ContentService.createTextOutput(returnValue).setMimeType(ContentService.MimeType.JSON);
}
Let’s break this down into individual components. The first thing the doGet() function does is open the Drive folder. While the folder isn’t used directly, the DriveApp call is required for permissions: image URIs generated without the Drive permissions are different from those generated with them, and we want the fully generated URIs to pass to the blog site.
The subsequent if statement chain checks the input to detect whether it actually contains an id. If it finds one, it passes the id into the getData() function, which I will describe next. There is then a fallback that passes a ‘blog not found’ JSON to the front end should the id not be present.
The getData() function first tries to pull the document from Google Drive via the file id. If this is successful it gets the body of the document and parses the content through two sub functions, getDataFromTable() and getDataFromParagraph().
The reason for this split is that code blocks are stored in single-cell table objects, while text and images, inline or standalone, are located in paragraphs. As these are different objects in the Google Docs file, we can separate them and parse them sequentially as they appear in the body. The getDataFromTable() function therefore handles code blocks, and getDataFromParagraph() handles everything else.
Each of these sub functions creates objects structured to provide both the content and the styling information. These objects are pushed into an array that becomes the returnValue.
Finally, there is a catch that returns a ‘blog not found’ payload should the document not exist due to an error in the id sent.
function getData(fileId) {
  try {
    // Fetch the raw document structure via the advanced Docs service
    const gdoc = Docs.Documents.get(fileId);
    const body = gdoc.body;
    const listOfContent = body.content;
    const returnValue = [];
    // Parse the tables (code blocks) and paragraphs of the doc in order
    for (let i = 0; i < listOfContent.length; i++) {
      if (listOfContent[i].table) {
        const table = getDataFromTable(listOfContent[i].table);
        returnValue.push(table);
      }
      if (listOfContent[i].paragraph) {
        if (listOfContent[i].paragraph.elements) {
          const paragraph = getDataFromParagraph(listOfContent[i].paragraph, gdoc);
          if (paragraph != null) {
            returnValue.push(paragraph);
          }
        }
      }
    }
    return returnValue;
  }
  catch (err) {
    // The document couldn't be fetched, most likely because of a bad id
    return [{
      style: "HEADING_1",
      className: ["blog-text"],
      content: ["Error, blog not found!"]
    }];
  }
}
getDataFromTable()
The getDataFromTable() function works on the principle that all tables will be single-celled. It therefore takes the first cell and parses all the textRun content into an object with style, className and content keys. All the textRuns of the cell are pushed into a single array, and that array is returned as the content value.
function getDataFromTable(table) {
  // All tables are assumed to be single-cell code blocks
  const cell = table.tableRows[0].tableCells[0].content;
  const cellValue = [];
  for (const paragraph in cell) {
    for (const element in cell[paragraph].paragraph.elements) {
      cellValue.push(cell[paragraph].paragraph.elements[element].textRun.content);
    }
  }
  const content = [cellValue];
  const returnValue = {style: "CODE", className: ["blog-code-block"], content};
  return returnValue;
}
getDataFromParagraph()
This function is more complicated than getDataFromTable() because we need to access the image contentUri to pass to the front end. Additionally we need to take the style from the doc and pass it on, so that when the JSON is read by the front end it knows which tag to use when reconstructing the data.
The namedStyleType takes the basic styling in Google Docs (Title, Headings 1 to 6, Subtitle, Normal text) and turns it into an easy-to-use tag to pass on, so the first key created is the style key.
Paragraphs in Google Docs are made of multiple elements, which include inlineObjectElements as well as textRuns. The process therefore moves through the possible element types in turn and appends them to arrays that are passed as the className and content keys.
One other thing the process does is drop paragraphs with empty content arrays, so that the output formatting stays consistent.
function getDataFromParagraph(paragraph, gdoc) {
  // The named style (TITLE, HEADING_1, NORMAL_TEXT, ...) becomes the tag hint
  const style = paragraph.paragraphStyle.namedStyleType;
  const elements = paragraph.elements;
  const content = [];
  const className = [];
  for (const element in elements) {
    if (elements[element].inlineObjectElement) {
      // Inline image: look up its contentUri and description in the doc's inlineObjects
      className.push("blog-image");
      const imageKey = elements[element].inlineObjectElement.inlineObjectId;
      const imageCuri = gdoc.inlineObjects[imageKey].inlineObjectProperties.embeddedObject.imageProperties.contentUri;
      const imageDescription = gdoc.inlineObjects[imageKey].inlineObjectProperties.embeddedObject.description;
      content.push({imageCuri, imageDescription});
    }
    else if (elements[element].textRun) {
      // Skip content that is only a newline character
      if (elements[element].textRun.content.length !== 1) {
        // Hyperlinked text
        if (elements[element].textRun.textStyle.link) {
          className.push("link");
          const linkcontent = {link: elements[element].textRun.textStyle.link.url, text: elements[element].textRun.content};
          content.push(linkcontent);
        }
        // Italic text
        else if (elements[element].textRun.textStyle.italic) {
          className.push("italics");
          content.push(elements[element].textRun.content);
        }
        // Bold text
        else if (elements[element].textRun.textStyle.bold) {
          className.push("bold");
          content.push(elements[element].textRun.content);
        }
        // Plain text
        else {
          className.push("blog-text");
          content.push(elements[element].textRun.content);
        }
      }
    }
  }
  const returnValue = {style, className, content};
  // Drop paragraphs with no usable content (e.g. empty lines)
  if (returnValue.content.length !== 0) {
    return returnValue;
  }
  return;
}
The JSON
The JSON returned by the Apps Script doGet() has the following structure: essentially an array of objects, each having a style, className and content. The content for images includes the image contentUri and the description, should that be available, while text is passed along with styling such as italics, bold, headings and links.
Below is an example of the structure, for a blog that isn’t found.
[
  {
    "style": "HEADING_1",
    "className": ["blog-text"],
    "content": ["Error, blog not found!"]
  }
]
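For comparison, a typical paragraph from a real article comes through looking something like this. The values are illustrative, and the link URL is invented:

[
  {
    "style": "HEADING_1",
    "className": ["blog-text"],
    "content": ["Making This Blog"]
  },
  {
    "style": "NORMAL_TEXT",
    "className": ["blog-text", "link"],
    "content": [
      "So one thing that a lot of coders do is write a blog. ",
      {"link": "https://example.com", "text": "a previous article"}
    ]
  }
]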
When I set out to create a blog to better showcase my coding, I initially looked at a number of pre-packaged systems that I could build on. In the end, as I already had a project involving Google Drive on my table, I decided to use the knowledge I was gaining to develop my own blog, using Google Docs as the word processor, Google Drive as the database, Google Apps Script as the back end, and Next.js as the front end.
I feel that this was one of the most interesting projects that I have undertaken, and I needed to really get to grips with the concept of RESTful API structure as well as the object structuring of Google Docs. Google Apps Script provides a free-to-use API platform with a custom set of tools, and although this could have been done with Next.js using Node architecture, I felt that learning Google Apps Script increased my repertoire and allowed me to work with a different coding environment and a different set of restrictions.
This blog concludes with the development of the back end; in my next blog I will describe how the front end was built. As is always the way, I feel that I have learned a substantial amount, and should I start this process again I would not make all the choices that I originally made. However, because no project is ever truly completed, I felt that I needed to move on to new and exciting projects. At the end of the day Google Apps Script accomplishes everything that I need it to do, providing a RESTful API via JSON and operating as a lightweight CMS for my blogs. All I need to do is drop a completed blog into the appropriate folder, and when I navigate to my web page it is loaded, with no further work required on my end.