
Archive for the ‘AI’ Category

Using Machine Learning to Enhance the Resolution of Bible Maps

Friday, March 1st, 2019

In a previous post, I discussed how 3D software could improve the resolution of Bible maps by fractally enhancing a digital elevation model and then synthetically creating landcover. In this post I’ll look at how machine learning can increase the resolution of freely available satellite images to generate realistic-looking historical maps.

Acquiring satellite imagery

The European Sentinel-2 satellites photograph much of the earth every few days at a ten-meter optical resolution (i.e., one pixel represents a ten-meter square on the ground). The U.S. operates a similar system, Landsat 8, with a fifteen-meter resolution. Commercial vendors offer much higher-resolution imagery, similar to what you find in Google Maps, at a prohibitive cost (thousands of dollars). By contrast, both Sentinel-2 and Landsat are government-operated and have freely available imagery. Here’s a comparison of the two, zoomed in to level 16 (1.3 meters per pixel), or well above their actual resolution:

Sentinel-2 shows more washed-out colors at a higher resolution than Landsat 8.

The Sentinel-2 imagery looks sharper thanks to its higher resolution, though the processing to correct the color overexposes the light areas, in my opinion. Because I want to start with the sharpest imagery, for this post I’ll use Sentinel-2.

I use Sentinel Playground to find a scene that doesn’t have a lot of clouds and then download the L2A, or atmosphere- and color-corrected, imagery. If I were producing a large-scale map that involved stitching together multiple photos, I’d use something like Sen2Agri to create a mosaic of many images, or a “basemap” (as in Google Maps). (Doing so is complicated and beyond the scope of this post.)

I choose a fourteen-kilometer-wide scene from January 2018 showing a mix of developed and undeveloped land near the northwest corner of the Dead Sea at a resolution of ten meters per pixel. I lower the gamma to 0.5 so that the colors approximately match the colors in Google Maps to allow for easier comparisons.

The Sentinel-2 scene.
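For reproducibility, here’s roughly what that gamma adjustment looks like in code. This is a minimal sketch that assumes an 8-bit RGB export of the scene and that the viewer applies gamma as value ** (1 / gamma), the usual convention; the file names are hypothetical.

```python
import numpy as np
from PIL import Image

# Hypothetical file name; any 8-bit RGB export of the scene works.
scene = np.asarray(Image.open("sentinel2_dead_sea.png")).astype(np.float32) / 255.0

# Assuming the viewer applies gamma as value ** (1 / gamma), lowering
# gamma to 0.5 squares the normalized values, darkening the washed-out
# highlights toward what Google Maps shows.
gamma = 0.5
adjusted = scene ** (1.0 / gamma)

Image.fromarray((adjusted * 255).round().astype(np.uint8)).save("scene_gamma_05.png")
```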

Increasing resolution

“Enhance!” is a staple of crime dramas, where a technician magically increases the resolution of a photo to provide crucial evidence needed by the plot. Super-resolution doesn’t work as well in reality as it does in fiction, but machine-learning algorithms have grown much more sophisticated in the past two years, and I thought it would be worth seeing how they performed on satellite photos. Here’s a detail of the above image, as enlarged by four different algorithms, plus Google Maps as the “ground truth.”

Comparison of four different super-resolution algorithms plus Google Maps, as discussed in the following paragraphs.

Each algorithm increases the original resolution by four times, providing a theoretical resolution of 2.5 meters per pixel.

The first, “raw pixels,” is the simplest; each pixel in the original image now occupies sixteen pixels (4×4). It was instantaneous to produce; a two-line version appears after this list.

The second, “Photoshop Preserve Details 2.0,” uses the machine-learning algorithm built into recent versions of Photoshop. This algorithm took a few seconds to run. Generated image (1 MB).

The third, ESRGAN as implemented in Runway, reflects a state-of-the-art super-resolution algorithm for photos, though it’s not optimized for satellite imagery. This algorithm took about a minute to run on a “cloud GPU.” Generated image (1 MB).

The fourth, Gigapixel, uses a proprietary algorithm to sharpen photos; it also isn’t optimized for satellite imagery. This algorithm took about an hour to run on a CPU. Generated image (6 MB).

The fifth, Google Maps, reflects actual high-resolution (my guess is around 3.7 meters per pixel) photography.
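As promised above, the “raw pixels” baseline takes only a couple of lines to reproduce; it’s plain nearest-neighbor resampling with Pillow (the file names are hypothetical):

```python
from PIL import Image

# Hypothetical file name for the cropped detail of the scene.
img = Image.open("sentinel2_detail.png")

# Each source pixel becomes a 4x4 block of identical pixels; no detail
# is invented, which is what makes this the baseline.
enlarged = img.resize((img.width * 4, img.height * 4), resample=Image.NEAREST)
enlarged.save("detail_raw_pixels_4x.png")
```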

Discussion

To my eye, the Gigapixel enlargement looks sharpest; it plausibly adds detail, though I don’t think anyone would mistake it for an actual 2.5-meter resolution satellite photo.

The stock ESRGAN enlargement doesn’t look quite as good to me; however, in my opinion, ESRGAN offers a lot of potential if tweaked. The algorithm already shows promise in upscaling video-game textures (a use the algorithm’s creators didn’t envision), and I think that taking the existing model developed by the researchers and training it further on satellite photos could produce higher-quality images.

I didn’t test the one purpose-built satellite image super-resolution algorithm I found because it’s designed for much-higher-resolution (thirty-centimeter) input imagery.

Removing modern features

One problem with using satellite photos as the base for historical maps involves dealing with modern features: agriculture, cities, roads, etc., that weren’t around in the same form in the period the map depicts. Machine learning offers a solution to this problem as well: Photoshop’s content-aware fill allows you to select an area of an image for Photoshop to plausibly fill in with similar content. For example, here’s the Gigapixel-enlarged image with human-created features removed by content-aware fill:

Modern features no longer appear in the image.
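I used Photoshop interactively, but if you wanted to script this step, classical inpainting in OpenCV is a rough, lower-quality analogue of content-aware fill (it won’t synthesize texture as convincingly). A minimal sketch, assuming you already have a black-and-white mask marking the modern features; both file names are hypothetical:

```python
import cv2

# Hypothetical file names: the enlarged scene and a mask whose white
# pixels mark the modern features (roads, buildings, fields) to remove.
image = cv2.imread("detail_gigapixel_4x.png")
mask = cv2.imread("modern_features_mask.png", cv2.IMREAD_GRAYSCALE)

# Telea's inpainting fills masked pixels from the surrounding terrain.
# inpaintRadius is the neighborhood (in pixels) sampled around each hole.
result = cv2.inpaint(image, mask, inpaintRadius=5, flags=cv2.INPAINT_TELEA)
cv2.imwrite("detail_modern_removed.png", result)
```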

I made these edits by hand, but at scale you could use OpenStreetMap’s land-use data to mask candidate areas for content-aware replacement:

Data from OpenStreetMap shows roads, urban areas, farmland, etc.
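Here’s a sketch of that masking step, assuming you’ve exported the relevant OSM land-use and road polygons as GeoJSON already reprojected to the image’s coordinate system (file names hypothetical); rasterio burns the polygons into the same kind of mask the inpainting step above consumes:

```python
import json

import rasterio
from rasterio.features import rasterize
from shapely.geometry import shape

# Hypothetical GeoJSON export of OSM landuse/highway polygons,
# already reprojected to the satellite image's coordinate system.
with open("osm_landuse.geojson") as f:
    features = json.load(f)["features"]

# Match the mask's grid to the satellite image.
with rasterio.open("sentinel2_dead_sea.tif") as src:
    height, width = src.height, src.width
    transform, crs = src.transform, src.crs

# Burn polygons as 255 (candidates for replacement) on a 0 background.
mask = rasterize(
    [shape(feat["geometry"]) for feat in features],
    out_shape=(height, width),
    transform=transform,
    fill=0,
    default_value=255,
    dtype="uint8",
)

with rasterio.open(
    "modern_features_mask.tif", "w", driver="GTiff",
    height=height, width=width, count=1,
    dtype="uint8", transform=transform, crs=crs,
) as dst:
    dst.write(mask, 1)
```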

Conclusion

If you want to work with satellite imagery to produce a high-resolution basemap for historical or Bible maps, then using machine learning both to sharpen the images and to remove modern features could be a viable, if time-consuming, process. The image in this post covers about 100 square kilometers; modern Israel is over 20,000 square kilometers, or a couple hundred scenes this size. And this scene contains a mostly undeveloped area; large-scale cities are harder to erase with content-aware fill because there’s less surrounding wilderness for the algorithm to work with. But if you’re willing to put in the work, the result could be a free, plausibly realistic, reasonably detailed map over which you can overlay your own data.

Google Will Now Answer Your Theological Questions

Saturday, April 14th, 2018

Google just announced an AI-powered experiment called Talk to Books, which lets you enter a query and find passages in books that are semantically similar to your query, not merely passages that happen to match the keywords you chose. For theology- and Bible-related questions, it often presents an evangelical perspective, perhaps because U.S. evangelical publishers have been eager for Google to index their books.

Here are some questions I asked it, with a sample response (not always the first):

Does God exist? “Creatures may or may not exist; God must exist; He cannot not exist.” — The Catholic Collection.

Why does a good God allow suffering? “Either you somehow deny the world’s suffering (that is, suffering is eventually shown to belong to a higher order of goodness) or else one or more of God’s characteristics (existence, benevolence, omnipotence) are denied.” — A Philosophy of Evil.

When does the rapture happen? “Depending upon one’s view, the rapture occurs either before, during, or after a seven-year period of intense trial and trauma on earth known as the tribulation, as recorded in Revelation 6-19.” — Armed Groups: Studies in National Security, Counterterrorism, and Counterinsurgency.

Where is Jesus now? “Wherever you are as you read these words, he is present.” — And the Angels Were Silent. Some of the other answers, like “He is on the shore of the Sea of Galilee with Andrew and other apostles,” are on the strange side. Even in context, the answer is wrong, as this sentence is talking about Peter, not Jesus.

It totally whiffs on “Who is Abraham’s father?” Rather than interpreting the question and providing a factual answer (Terah), it presents a number of passages describing how Abraham is the father of Isaac or of Isaac’s descendants. These passages relate semantically but don’t answer the question.

Answers to 'What is the role of the Holy Spirit' include responses from an NKJV study Bible and Billy Graham.
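Under the hood, this is semantic search: queries and passages are embedded as vectors, and the closest passages win, whether or not they share any keywords. Here’s a toy sketch of the idea using the open-source sentence-transformers library; this is not Google’s actual model, and the passages are just the examples above:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

passages = [
    "Creatures may or may not exist; God must exist; He cannot not exist.",
    "The rapture occurs either before, during, or after the tribulation.",
    "He is on the shore of the Sea of Galilee with Andrew and other apostles.",
]
query = "Does God exist?"

# Embed the query and passages, then rank passages by cosine similarity;
# a keyword match would miss passages that share no words with the query.
scores = util.cos_sim(model.encode(query), model.encode(passages))[0]
for score, passage in sorted(zip(scores.tolist(), passages), reverse=True):
    print(f"{score:.2f}  {passage}")
```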

Rise of the Robosermon

Sunday, April 29th, 2012

In a recent issue of Wired, Steven Levy writes about Narrative Science, a company that uses data to write automated news stories. Right now, they mostly deal in data-intensive fields like sports and finance, but the company is confident that it will easily expand into other areas—the company’s co-founder even predicts that an algorithm will win a Pulitzer Prize in the next five years.

In February 2012, I attended a session at the TOC Conference given in part by Kristian Hammond, the CTO and co-founder of Narrative Science. During the session, Hammond mentioned that sports stories have a limited number of angles (e.g., a “blowout win” or a “come-from-behind victory”)—you can probably sit down and think up a fairly comprehensive list in short order. Even in fictional sports stories, writers only use around sixty common tropes as part of the narrative. Once you have an angle (or your algorithm has decided on one), you just slot in the relevant data, add a little color commentary, and you have your story.
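Mechanically, that’s just template filling: classify the game into an angle, then slot the numbers in. A toy sketch (the fields, thresholds, and template wording are invented for illustration, not Narrative Science’s actual system):

```python
# Toy illustration of angle-driven story generation: classify the game
# into an angle, then slot the data into that angle's template.
ANGLE_TEMPLATES = {
    "blowout win": "{winner} crushed {loser} {ws}-{ls}, leading from start to finish.",
    "come-from-behind victory": "Down late, {winner} rallied past {loser} for a {ws}-{ls} win.",
}

def pick_angle(game):
    # Invented thresholds; a real system would have many more angles.
    if game["ws"] - game["ls"] >= 10:
        return "blowout win"
    return "come-from-behind victory" if game["trailed_late"] else "blowout win"

game = {"winner": "Lions", "loser": "Tigers", "ws": 7, "ls": 5, "trailed_late": True}
print(ANGLE_TEMPLATES[pick_angle(game)].format(**game))
```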

At the time, I was struggling to understand how automated content could apply to Bible study; Levy’s article leads me to think that robosermons, or sermons automatically generated by a computer program, are the way of the future.

Parts of a Robosermon

Futurama has a robot preacher. I’ve never seen these episodes, so hopefully this image isn’t terribly heretical.

After all, from a data perspective, sermons don’t differ much from sports stories. In particular, they have three components:

First, as with sports stories, sermons follow predictable structures and patterns. David Schmitt of Concordia Theological Seminary suggests a taxonomy of around thirty sermon structures. Even if this list isn’t comprehensive, it would probably take, at most, 100 to 200 structures to categorize nearly all sermons.

Second, sermons deal with predictable content: whereas sports have box scores, sermons have Bible texts and topics. A sermon will probably deal with a passage from the Bible in some way—the 31,000 verses in the Bible form a large but manageable set of source material (especially since most sermons involve a passage, not a single verse; you can probably cut this list down to around 2,000 sections). Topically, SermonCentral.com lists only 500 sermon topics in their database of 120,000 sermons. The power-law popularity distribution (i.e., the 80/20 rule) of verses preached on (SermonCentral.com has 1,200 sermons on John 1 compared to seven on Numbers 35) and topics (1,400 sermons on “Jesus’ teachings” vs. four on “morning”) means that you can categorize most sermons using a small portion of the available possibilities, as the sketch after this list illustrates.

Third, sermons generally involve illustrations or stories, much like the color commentary of sports stories. Finding raw material for illustrations shouldn’t present a problem for a computer program; a quick search on Amazon turns up 1,700 books on sermon illustrations and an additional 10,000 or so on general anecdotes. You can probably extract hundreds of thousands of illustrations from just these sources. Alternatively, if a recent news story relates to your topic, the system can add the relevant parts to your sermon with little trouble (especially if a computer wrote the news story to begin with).
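Here’s the sketch promised above: the practical effect of a power-law distribution is that a small fraction of topics covers most sermons. The Zipf exponent below is an assumption for illustration, not a fit to SermonCentral’s numbers.

```python
import numpy as np

# 500 topics with Zipf-like popularity: topic k's share is proportional
# to 1/k. The exponent 1.0 is an assumption, not a fit to real data.
ranks = np.arange(1, 501)
popularity = 1.0 / ranks
share = np.cumsum(popularity) / popularity.sum()

# How much of all sermons do the top 20% of topics account for?
print(f"Top 100 of 500 topics cover {share[99]:.0%} of sermons")
```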

Application

You end up being able to say, “I want to preach a sermon on Philippians 2 that emphasizes Christ’s humility as a model for us.” Then—and here’s the part that doesn’t exist yet but that technology like Narrative Science’s will provide—an algorithm suggests, say, an amusing but poignant anecdote to start with, followed by three points of exegesis, exhortation, and application, and finishing with a trenchant conclusion. You tweak the content a bit, throwing in a shout-out to a behind-the-scenes parishioner who does a lot of work but rarely receives recognition, and call it done.

Why limit sermons to pastors, though? Why shouldn’t churchgoers be able to ask for custom sermons that fit exactly their circumstances? “I’d like a ten-part audio sermon series on Revelation from a dispensational perspective where each sermon exactly fits the length of my commute.” “Give me six weeks of premarital devotions for my boyfriend and me. I’ve always been a fan of Charles Spurgeon, so make it sound like he wrote them.”

Levy opens his Wired article with an anecdote about how grandparents would find articles about their grandchildren’s Little League games just as interesting as “anything on the sports pages.” He doesn’t mention that what they really want is a recap with their grandchild as the star (or at least as a strong supporting character—it’s like one of those children’s books where you customize the main character’s name and appearance). Robosermons let you tailor the sermon’s content so that your specific problems or questions form the central theme.

The logical end of this technology is a sermonbot that develops a following of eager listeners and readers, in the same way that an automated newspaper reporter would create fans on its way to winning a Pulitzer.

You may argue that robosermons diminish the role of the Holy Spirit in preparing sermons, or that they amount to plagiarism. I’m not inclined to disagree with you.

Conclusion

Building a robosermon system involves five components: (1) sermon structures; (2) Bible verses; (3) topics; (4) illustrations; and (5) technology like Narrative Science’s to put everything together coherently. It would also be helpful to have (6) a large set of existing sermons to serve as raw data. It’s a complicated problem but hardly an insurmountable one over the next ten years, should someone want to tackle it.

I’m not sure they should; that way lies robopologetics and robovangelism.
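If someone were to tackle it anyway, the scaffolding is the easy part. Here’s a skeletal sketch of components (1) through (4), with every field, structure name, and placeholder string invented for illustration; component (5), the generation technology, is exactly the part this stub waves away:

```python
from dataclasses import dataclass
import random

@dataclass
class SermonRequest:
    passage: str    # e.g., "Philippians 2"
    topic: str      # e.g., "humility"
    structure: str  # one of the ~30 structures in Schmitt's taxonomy

# Components (1)-(4): structures, passages, topics, illustrations.
STRUCTURES = {"three-point": ["exegesis", "exhortation", "application"]}
ILLUSTRATIONS = {"humility": ["An amusing but poignant anecdote..."]}

def compose(request: SermonRequest) -> str:
    # Component (5), the generation technology, would produce real prose
    # here; this stub just assembles placeholders in the right order.
    opening = random.choice(ILLUSTRATIONS[request.topic])
    points = [f"A point of {move} on {request.passage}."
              for move in STRUCTURES[request.structure]]
    return "\n".join([opening, *points, "A trenchant conclusion."])

print(compose(SermonRequest("Philippians 2", "humility", "three-point")))
```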

If you’re not an algorithm and you want to know how to prepare and deliver a sermon, I suggest listening to this 29-part course on preaching by Bryan Chapell at Biblical Training. It’s free and full of homiletic goodness.