So this blog is built in Gatsby and its content is provided by Drupal 8. I learned a lot by building it, but I still had something on my backlog: show Drupal inline media in the Gatsby frontend.

Enter Drupal 8.8
One of the great improvements in 8.8 is the media library and its integration with the text editor, which makes it easy for editors to embed media.

Yay! That's definitely something I need in my life to make it easier to embed pictures:

My parents dog - a great white shepherd (also great as whiteboard)

The alternative would be to use Paragraphs and build an integration for that. That would work too, but for my blog I wanted to keep everything as simple as possible: Paragraphs is very flexible, but it brings a lot of overhead and complexity.

How can we do this? Until now I just used the processed value from Drupal, which is safe as long as the appropriate text format filters are configured on the Drupal side:

<div className="page-body" dangerouslySetInnerHTML={{ __html: entity.body.processed }}></div>

Drupal also provides a filter to embed media. This filter processes the HTML, looks for media tags and transforms them into the rendered media item. In the case of the doggo above, the raw value in the editor looks like this:

<drupal-media data-align="center" data-caption="My parents dog - a great white shepherd (also great as whiteboard)" data-entity-type="media" data-entity-uuid="16954f2e-e03c-4d00-85a6-5be1cde42833" data-view-mode=""></drupal-media>

But if we keep using the processed value, the tag gets replaced by an image URL in Drupal and Gatsby loses the reference to the media item. So the idea is to use the raw value from the text field and apply our own filters for sanitization and transformation.
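To make the difference concrete, here is a rough sketch of what the raw value still gives us: the UUIDs of the embedded media items. The helper name `extractMediaUuids` is made up for this illustration, and the regex is only good enough for a quick check or for debugging. The actual rendering further down relies on a real HTML parser.

```javascript
// Hypothetical helper (not part of the blog's code): list the media UUIDs
// referenced by <drupal-media> tags in a raw body string.
function extractMediaUuids(rawBody) {
  const uuids = [];
  // Match each opening <drupal-media ...> tag and capture its attribute string.
  const tagPattern = /<drupal-media\b([^>]*)>/g;
  let match;
  while ((match = tagPattern.exec(rawBody)) !== null) {
    const uuidMatch = /data-entity-uuid="([^"]+)"/.exec(match[1]);
    if (uuidMatch) {
      uuids.push(uuidMatch[1]);
    }
  }
  return uuids;
}

// Shortened version of the raw value shown above.
const raw =
  '<p>Hello</p><drupal-media data-align="center" data-entity-type="media" ' +
  'data-entity-uuid="16954f2e-e03c-4d00-85a6-5be1cde42833"></drupal-media>';

console.log(extractMediaUuids(raw));
// logs [ '16954f2e-e03c-4d00-85a6-5be1cde42833' ]
```

With the processed value this list would come back empty, because the media filter has already replaced the tag.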

The raw value is easily retrieved by adding it to your GraphQL query:

...
      body {
        processed
        value
      }
...

Turns out there is a great parser we can use called react-html-parser. It not only helps us transform the media, it also removes the need for dangerouslySetInnerHTML. Great!

I decided to make a Body component which encapsulates all the logic and can be reused. I also created a Media component and a component for each media bundle type I use (image, video, remote-video, etc).

body.js

import React from "react";
import { useStaticQuery, graphql } from "gatsby";
import ReactHtmlParser from "react-html-parser";
import sanitizeHtml from "sanitize-html";
import Media from "../media/media";

const Body = (props) => {

  const mediaItems = useStaticQuery(graphql`
    query mediaQuery {
      drupal {
        mediaQuery {
          entities {
            entityId
            entityBundle
            entityUuid
            ... on Drupal_MediaImage {
              mid
              uuid
              fieldMediaImage {
                mediaGatsbyFile {
                  publicURL
                  childImageSharp {
                    fluid {
                      ...GatsbyImageSharpFluid
                    }
                  }
                }
                entity {
                  url
                }
              }
            }
            ... on Drupal_MediaVideo {
              mid
              uuid
              fieldMediaVideoFile {
                mediaGatsbyFile {
                  publicURL
                }
                entity {
                  url
                }
              }
            }
            ... on Drupal_MediaRemoteVideo {
              mid
              uuid
              fieldMediaOembedVideo
            }
          }
        }
      }
    }
  `);

  const processBody = (body) => {
    // The raw value is untrusted: sanitize it before parsing.
    const clean = sanitizeHtml(body, {
      allowedTags: sanitizeHtml.defaults.allowedTags.concat(['drupal-media']),
      allowedAttributes: {
        ...sanitizeHtml.defaults.allowedAttributes,
        // 'alt' is allowed as well because MediaImage reads attr.alt.
        'drupal-media': ['data-*', 'alt'],
      },
    });
    return ReactHtmlParser(clean, {
      transform: (node) => {
        // Only handle embedded media tags; returning undefined keeps the default rendering.
        if (node.type !== 'tag' || node.attribs['data-entity-type'] !== 'media') {
          return undefined;
        }
        const uuid = node.attribs['data-entity-uuid'];
        const media = mediaItems.drupal.mediaQuery.entities.find((item) => item.uuid === uuid);
        return media ? <Media key={uuid} attr={node.attribs} media={media} /> : undefined;
      },
    });
  };

  return (
    <div className={props.className}>{processBody(props.children)}</div>
  )
};

export default Body;

First, I sanitize the data because we are now using the raw value. sanitize-html takes over the role of the Drupal filter 'Limit allowed HTML tags and correct faulty HTML'. I just needed to add drupal-media and its attributes to the allowed list, or else they would get removed.

react-html-parser does not prevent XSS - sanitize your data

Then we use ReactHtmlParser, which takes an options object as its second argument with two available functions: preprocessNodes and transform. I'm only using the transform function: it looks for media tags by checking the data-entity-type attribute. If one is found, it retrieves the UUID from the data-entity-uuid attribute and uses that ID to look for a match in the mediaItems we queried for.
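The matching step can also be pulled out of the component, which makes it easy to try without React. Below is a minimal sketch under that assumption: `buildMediaIndex` and `findMediaForNode` are names invented for this example, and the sample data is made up to mimic the shape of the query result and of a parsed node.

```javascript
// Build a uuid -> media lookup once, instead of looping over every entity
// for each <drupal-media> tag. `entities` has the shape of
// mediaItems.drupal.mediaQuery.entities from the static query.
function buildMediaIndex(entities) {
  const index = new Map();
  for (const media of entities) {
    // Entities that matched none of the fragments have no uuid field; skip them.
    if (media && media.uuid) {
      index.set(media.uuid, media);
    }
  }
  return index;
}

// Return the matching media entity for a parsed node, or undefined when the
// node is not a media embed or no entity matches.
function findMediaForNode(node, index) {
  if (node.type !== "tag" || !node.attribs) return undefined;
  if (node.attribs["data-entity-type"] !== "media") return undefined;
  return index.get(node.attribs["data-entity-uuid"]);
}

// Made-up sample data for illustration.
const entities = [
  { uuid: "aaa", entityBundle: "image" },
  { uuid: "bbb", entityBundle: "remote_video" },
];
const index = buildMediaIndex(entities);
const node = {
  type: "tag",
  name: "drupal-media",
  attribs: { "data-entity-type": "media", "data-entity-uuid": "bbb" },
};

console.log(findMediaForNode(node, index).entityBundle); // logs "remote_video"
```

For a blog with a handful of media items the plain loop is fine; the Map only starts to matter when a body contains many embeds and the site has many media entities.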

Note: In the query above I only queried for images, videos and remote videos since I didn't need documents or audio.

If a match is found, the media entity object is passed to the Media component along with the attributes. The attributes contain stuff like captions, alt text, alignment and view-mode. Everything is returned as a bunch of React elements, which is why we don't need to use dangerouslySetInnerHTML anymore.
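The Media component itself is not shown in this post, so the following is only a sketch of how the dispatch to a bundle component might work. It checks which field the query returned instead of relying on the bundle name. The function name `describeMedia`, the component names `MediaVideo` and `MediaRemoteVideo`, and the `src` and `url` props are assumptions for illustration; only the field names come from the query above.

```javascript
// Hypothetical helper: decide which bundle component to render, and with
// which props, based on the fields the static query returned.
function describeMedia(media, attr) {
  if (media.fieldMediaImage) {
    const file = media.fieldMediaImage.mediaGatsbyFile;
    // childImageSharp can be missing for files Sharp does not process.
    const sharp = file && file.childImageSharp;
    return { component: "MediaImage", props: { attr, fluid: sharp ? sharp.fluid : null } };
  }
  if (media.fieldMediaVideoFile) {
    const file = media.fieldMediaVideoFile.mediaGatsbyFile;
    return { component: "MediaVideo", props: { attr, src: file ? file.publicURL : null } };
  }
  if (media.fieldMediaOembedVideo) {
    return { component: "MediaRemoteVideo", props: { attr, url: media.fieldMediaOembedVideo } };
  }
  // Bundle we did not query for (documents, audio, ...).
  return null;
}

// Made-up sample entity shaped like a remote video from the query.
const sample = { uuid: "bbb", fieldMediaOembedVideo: "https://example.com/video" };
const described = describeMedia(sample, { "data-align": "center" });

console.log(described.component); // logs "MediaRemoteVideo"
```

A Media component could then switch on `component` and spread `props` into the matching bundle component.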

In the case of an image like the one above, the media-image component displays the image, and we can leverage Gatsby's Img component using the data from the fluid query:

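import React from "react";
import Img from "gatsby-image";

const MediaImage = (props) => {
  // attr: attributes of the <drupal-media> tag, fluid: childImageSharp.fluid from the query
  const { attr, fluid } = props;

  return (
    <div className={`media media--image media--align-${attr['data-align'] || 'none'}`}>
      <Img alt={attr.alt} fluid={fluid} />
      {attr['data-caption'] &&
        <div className="caption">{attr['data-caption']}</div>
      }
    </div>
  );
};

export default MediaImage;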

I don't actually use view modes, but it would be easy to implement different displays based on the attribute value. One thing you might wonder: why don't we query for a specific media item, since we know the ID? That's something I also mentioned in the other blog post, and it is a limitation of the way static queries work: they can't use variables. It will probably become possible in the future.
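As a rough idea of what view-mode support could look like, the wrapper class could simply include the value of data-view-mode. This is only a sketch: `mediaClassName` and the `media--view-mode-*` class are invented here, and an empty attribute (as in the doggo example) falls back to "default".

```javascript
// Hypothetical helper: build the wrapper class from the tag attributes so
// CSS can style each alignment and view mode differently.
function mediaClassName(attr, bundle) {
  const align = attr["data-align"] || "none";
  const viewMode = attr["data-view-mode"] || "default";
  return `media media--${bundle} media--align-${align} media--view-mode-${viewMode}`;
}

console.log(mediaClassName({ "data-align": "center", "data-view-mode": "" }, "image"));
// logs "media media--image media--align-center media--view-mode-default"
```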

Let me know what you think or if you have a better way to implement inline media images from Drupal in Gatsby!