It's a big world out there

Kimi and Jerilyn’s mother continued their bedtime story, “And then Adam the bat's friend, Jane the fish, showed up and stuck her head out of the water. Adam hastily finished chewing the beetle he was eating before exclaiming, 'Jane! Where've you been? We haven't seen each other in ages.' 'I've been exploring a marvelous new world. I wish I could show you,' said Jane the fish. 'Who says you can't? I can't swim, but you could carry me,' said Adam the bat. 'I suppose so. You'll have to hold your breath, though,' said Jane. So Adam took in a big breath of air, and held on to Jane's back. 'Hold on tight!' said Jane, and plunged below the surface. She swam, and swam, and swam, for what seemed like eternity, especially to the air-breather on her back trying to hold his breath. Finally, just as Adam was about to run out of breath, Jane the fish surfaced again, and Adam the bat took in a big breath of fresh air. There wasn't a rock ceiling above their heads, just empty space all the way up. And far, far, far above the ground there was an enormous intense light, so bright that neither bat nor fish could look directly into it without hurting their eyes, and it illuminated their surroundings so much that they could see far around them from light alone, without echolocation. Adam shook his wings dry, took off, flew up, and kept climbing higher into the air, with no rock ceiling above him to limit his ascent. Eventually he got spooked by how far he'd gotten from any solid object, so he started flying back down, and returned to Jane in the water. That's all for now, dears. Sleep tight.” Their mother kissed each of them on the forehead.

“Mom,” asked Kimi, “could there actually be a light that bright?”

“I don't know,” she answered, “but according to ancient myth, there is such a thing. Or was, at least. I suppose there's no way of knowing whether it's still around. It's in a far away world with no rock ceiling too, so goes the myth. Sweet dreams.” Their mother left.

“No rock ceiling,” Kimi whispered, “That's even wilder than the thing about the light. Like, would it just be air all the way up forever? Surely there'd have to be an end somewhere, right?”

“Maybe there is a rock ceiling there, but it's so high up that you can't hear the echo,” Jerilyn suggested.

“Wow, that would be so disorienting, not being able to hear the echo off the rock ceiling,” said Kimi.

“Given what Mom said about the light, maybe you could see the rock ceiling even if you couldn't echolocate it,” said Jerilyn.

“But I guess if people thought there wasn't a rock ceiling at all, it must be high enough that you can't see it either,” said Kimi.

“I guess so,” Jerilyn agreed.

“Jerilyn,” said Kimi.

“What?”

“Do you think it's real?”

“No,” said Jerilyn.

“Are you sure?”

Jerilyn hesitated. “No,” she said. She really had no way of knowing for sure, however outlandish it may sound.

“Jerilyn.”

“What?”

“I'm not actually feeling all that tired. Are you?”

“Eh, somewhat, but not especially.”

“Let's go find the place Mom was talking about.”

Jerilyn thought about it. On the one hand, the mythical place probably didn't exist, and even if it did, there was no way they were going to find it. On the other hand, an adventure might be fun. “Let's do it,” she said.

They snuck off and made their way to their canoe. They avoided making sounds so as not to advertise their presence, so they had to rely on touch to find their way, but they knew the route well enough that that wasn't a huge impediment.

They set off, and as they knew the waters immediately surrounding the dock by heart, they were able to navigate away from the island silently, but once they were a ways out, Kimi started making clicking noises with her tongue so they could echolocate their surroundings. They aimed straight for the closest point where the rock ceiling met the water. They couldn't echolocate that far, of course, but Jerilyn remembered the way from her navigation lessons. Once they got too far from the archipelago, they had to rely on trying to keep going in a straight line, but soon after, they encountered the wall.

“What now?” asked Kimi.

“I suppose we look around for a tunnel,” said Jerilyn.

They turned right and followed where the rock ceiling met the water, keeping it on their left, their casual conversation sufficing to provide enough noise for them to track their surroundings. They never found a tunnel. Eventually they got tired, pointed their canoe back in the direction they came from, and set off for home. When they first encountered an island, they weren't sure which one it was, and they went all the way around it in a circle so they could estimate its size and shape. It seemed unfamiliar, but Jerilyn thought back to her navigation lessons, and by the time they had completed their circle around the island, she came up with a guess as to which island it was. If she was right, they were significantly off course. She turned the canoe in the direction she thought home was, and when they passed the next island, she gained confidence that she was right, and indeed, their new path took them straight home, where they docked the canoe, dried themselves off, went straight to bed, and each fell asleep instantly.

*

Kimi and Jerilyn made several more expeditions to find tunnels to new worlds, taking off in different directions each time. On their fourth trip, they found an indentation well into the rock, which tapered out into a vein of air sticking just above the surface of the water. They got as far in as they could, until the rock ceiling got too low for them to stay under while they were sitting in the canoe. They stashed their paddles in the canoe, and carefully got out and swam farther in while towing the canoe. They soon reached a point where the canoe couldn’t go any farther even without them inside. They found a part of the rock ceiling that jutted down below the rest, and they pulled the end of their canoe downwards, pushed it under the jutting rock, and released it, so that the jutting rock extended into the canoe and would keep it from floating away.

They swam in further. But soon even the indentation they found sank below the surface of the water. They each took a big breath of air, and kept swimming farther out underwater. They hadn’t brought sonar rods or a light, and couldn’t snap underwater, so they had no way of echolocating underwater, and had to rely on touching the rock ceiling above them to tell where it was. They didn't get very far before Jerilyn decided that that wasn't a great idea. She turned back, and pushed Kimi to turn back as well. Even with Jerilyn's caution, they were both somewhat short of breath by the time they could get their noses back into the air.

On their next trip, they brought a pair of sonar rods, and aimed for the same indentation they had found on their previous trip. When they arrived at where the rock ceiling met the water, they were in unfamiliar territory. On their previous trip, they had been keeping the line where the rock ceiling met the water to their left as they’d followed it until finding the indentation, and this time, they’d tried going a bit to the right of the course they’d taken on the previous trip in an attempt to go more directly to the indentation, so they figured that they’d overcorrected, and turned left. They soon found the indentation again.

Again, they went as far as they could while keeping their heads above water, Kimi carrying the sonar rods. Then they dove down into the water, much deeper than necessary just to stay below the rock, so that they would be able to echolocate as far as possible without the nearby part of the rock getting in the way, and Kimi rang the sonar rods.

The rock ceiling’s descent flattened out not long after the last of it passed below the surface of the water, and there was a small air pocket just a bit after the rock ceiling flattened out. About twice as far in as the air pocket, the rock ceiling started to pitch up sharply.

They swam up back towards the surface for air, Kimi ringing the sonar rods between armstrokes so they could keep track of where the air was instead of bumping into the rock ceiling.

“Let’s check out that air pocket,” Kimi suggested, after they surfaced.

“Not a good idea,” said Jerilyn.

“We can totally make it there,” said Kimi.

“Air pockets sometimes have bad air in them. We could get there, of course, but I’m not so sure we could make it back after coughing out nasty air,” Jerilyn explained.

Kimi reluctantly agreed not to explore the air pocket, and they turned back.

*

On their next trip, they brought buckets. They figured if they weren’t sure they’d have enough air in their lungs for the trip to the air pocket and back, they could bring some more air outside their lungs.

When they’d gotten as far as they could while keeping their heads above water, they quickly discovered that it was just about impossible to swim underwater while carrying a bucket full of air. After a long while trying, they figured out how to get themselves positioned upside-down in the water with their feet against the rock ceiling while holding a bucket full of air pulling them up against the rock ceiling, so they could walk along it. Both of them still had trouble carrying a bucket and a pair of sonar rods underwater at the same time, so they’d put their sonar rods back in the canoe. But they were able to make enough sound to echolocate their immediate surroundings by hitting the sides of their buckets.

They both needed a breath by the time they got to the air pocket, as walking upside-down underwater was much slower than swimming. So they found flat portions of the rock ceiling to put their buckets down on, then turned around, exhaled, stuck their heads in their respective buckets, and took a breath. Then they exited the buckets, and Kimi approached the air pocket. She stuck her hand in, and made contact with the rock almost instantly; it was, evidently, a very shallow pocket. She stuck her nose in, being careful not to rise high enough to hit the rock ceiling, and, heeding Jerilyn’s warning, cautiously took a small breath of air. It was rancid. She coughed it up and recoiled out of the pocket, then scrambled for her bucket while fighting the urge to inhale. She finally got her head in the bucket, took deep breaths and kept coughing, while Jerilyn held her up so she could focus on regaining her breath instead of swimming.

By the time Kimi got her breathing under control, the air in the bucket was quite stale and she was short of breath again, so she left the bucket for her big sister to deal with while she swam back to fresh air. Jerilyn took another breath from her own bucket, dumped the remaining air out of the buckets, and swam back while carrying them, which took her a lot longer than it took Kimi because of the drag caused by the buckets.

“You were right. That was nasty,” Kimi commented, once Jerilyn surfaced.

They decided to make another trip underwater to try to explore past the ridge where the rock ceiling pitched back up again. They retrieved their sonar rods and tied them to Kimi’s wrist to make them easier to carry at the same time as the buckets, and set off in the same direction as before.

They set their buckets down near the air pocket, each took a breath, and then swam out to the ridge. Another ring of the sonar rods revealed that the rock ceiling pitched straight up into a vertical cliff, and that there was a wide expanse of air about thirty feet above them.

They retreated to their buckets, each took a breath from them, and then dumped the remaining air out and swam back with their buckets.

“I don’t understand how the water went so high up. The surface is definitely much higher on the other side than it is here,” said Jerilyn, after they surfaced again.

“Yeah, weird, isn’t it? Also, how are we supposed to get there? We canoed around the edge for miles in each direction and didn’t find any tunnels or places where it bends around or anything that could lead to that place,” said Kimi.

“There probably isn’t any route there going over the surface. If there was, it would be even harder to understand why the water level is different there than here,” said Jerilyn.

“A completely separate world! Do you think it’s the place Mom told us about?”

“I don’t know.”

*

They left for home, and on their next trip, they brought four buckets, with the intention of going all the way to the surface on the other side of the rock. Then they repeated the previous expedition’s trick of walking upside-down underwater with buckets of air, this time each carrying a bucket in each hand, which was even harder to get into position for, but eventually they figured it out. This made it not only difficult to ring the sonar rods, but also difficult to hit the buckets, and they resorted to periodically letting their buckets hit the rock ceiling to make enough noise to navigate.

They stopped briefly near the air pocket to turn rightside-up and take a breath from their buckets, and then turned back upside-down and kept going, buckets still in hand, all the way until the point where the ridge pitched back up again. They set down the buckets in stable locations, turned rightside-up, exhaled, took deep breaths from their full buckets, and swam up towards the surface, leaving four half-filled buckets on the underside of the ridge behind them, Kimi periodically ringing the sonar rods on their way up so they wouldn’t collide with the rock.

They surfaced and each began to take a deep breath, then stopped in shock, and cautiously started to breathe again. Something was off about the air. It smelled... not stale, exactly, but strange, not like any air they'd ever smelled before. It smelled overly fresh, in a way, as if all the air they’d breathed until that point had been a bit stale, and they hadn’t noticed.

Jerilyn raised a hand out of the water, shook some water off of it, and snapped. For the briefest of instants, they both thought that perhaps there wasn't a rock ceiling above them at all. But then they heard the echo, and realized that there was a rock ceiling above them at perhaps three times the height that they were accustomed to at home. And they couldn’t see any bright lights in the sky, or anything at all for that matter, so they couldn’t be in the place Mom had described in the myth. Aside from the rock on one side of them and bending into a ceiling far above them, there was nothing around them, just water for as far as they could hear.

“We gotta get the canoe in here so we can explore this place,” said Kimi.

“How in the world are we going to do that?” asked Jerilyn, realizing as she spoke that perhaps it should have been “how out of the world” rather than how in it.

“I don’t know,” said Kimi.

They swam around a bit, but didn’t find anything interesting, and decided to go home. They dove down under the ridge, retrieved their buckets and inhaled from them, surfaced on the other side, got in their canoe, and headed home.

*

Later, they did some experimenting at home, and discovered that their canoe was almost exactly the same density as water. Armed with this fortuitous fact, several buckets, and a lot of rope, they set off again for the other world.

A test run revealed that their rope wasn’t quite long enough to stretch from where they could park their canoe to the air on the other side of the rock. They found this out when Jerilyn had to drop the rope on her way up after crossing the ridge so she could surface and breathe; she then returned to their canoe, and they reeled the rope back in.

They set up three buckets of fresh air on the underside of the ridge, and one by the air pocket. Then Jerilyn took the sonar rods and swam out to the ridge and treaded water with her head in a bucket while Kimi filled the canoe with water, and pushed it underwater and forward, Jerilyn ringing the sonar rods in the water to help Kimi tell what she was doing as she swam under, and periodically ducking down into the water to keep herself updated on Kimi’s progress. Kimi wasn’t getting very good resolution from the sonar rods, but it helped that she remembered the path. Kimi, pushing the canoe ahead of her, reached the bucket by the air pocket and took a breath in it. Jerilyn took one last big breath from a fresh bucket and took off for the surface as Kimi continued forward pushing the canoe. When Jerilyn surfaced, she was able to help by reeling in the canoe, holding onto the rock cliff for leverage. Kimi went ahead of the canoe so she wouldn’t run out of air, and together they finished reeling in the canoe to the surface. With some difficulty, they emptied the water out of the canoe, righted it, and got back in.

Righting the canoe had been a lot of work, and they took a quick break to catch their breath. Then they set off in their canoe, keeping their old world to their left.

They heard sounds of civilization coming to them before they echolocated the island from their own snapping. They turned towards it and approached. They were noticed, and it seemed that they had caused a fair amount of consternation.

They got close, and a man was standing on the end of a peninsula near them holding a long, straight stick, facing them and snapping repeatedly. There were also boulders sticking above the water a ways to either side of them.

“Hello,” said Kimi, “Who are you? I’m Kimi.”

The man did not respond, but he did stop snapping and started clicking his tongue. The tongue-clicking wasn’t giving them good resolution on him, but they could tell he was moving in some way. Jerilyn snapped, revealing that the man had both hands on the stick, which was pointed at them, and he was leaning back as if about to throw it. Jerilyn dug her paddle into the water and swung them around, just as the man threw the stick. It narrowly missed Kimi.

“Hey, what was that for?” Kimi shouted.

“Kimi, paddle forward hard!” said Jerilyn, as she began to do so herself. They heard splashing sounds to either side of them, followed by the sounds of people swimming towards them. The man on the shore began clicking his tongue again, and seemed to be preparing for another throw. Jerilyn swung the canoe around again, and the stick just missed her. She resumed paddling forward, and the man on the shore dove into the water.

Someone grabbed the back of the canoe near Jerilyn and pulled himself up towards her. She moved her paddle between them just in time to block a thrust of a stick towards her. He grabbed her paddle with the hand that had been on the canoe. Kimi lunged at him and hit him in the neck with her paddle with a surprising amount of force for someone her size. He dropped Jerilyn’s paddle as well as his own stick and fell back into the water. The recoil from Kimi’s lunge caused their canoe to collide with someone else as he pulled up towards the position Kimi had just left. Jerilyn hit him over the head with the edge of her paddle, and he too lost his grip on the boat. Jerilyn pushed him away from the boat with her paddle while he was too disoriented to grab it, and then Kimi and Jerilyn returned to their former positions and kept paddling hard. No one caught up to them, and they relaxed a bit once their pursuers had given up.

It took a while before they encountered the next sign of civilization. They approached much more cautiously this time, coming to rest at shouting distance. A small gaggle of people were gathered at the shore closest to them.

“Hello!” one of them shouted. They sounded funny.

“Hello!” Kimi shouted back.

“What is that thing?” asked the person on the shore. Their words were tricky to understand.

“What thing?” asked Kimi.

“The thing you’re sitting on floating in the water,” the stranger clarified.

“The canoe?” asked Kimi.

“The what?”

“This is called a ‘canoe’,” said Kimi, slapping the side of the canoe.

“Okay, so, what is it?”

“You use it to cross the water,” said Kimi. She wasn’t sure what else to say about the concept of canoes.

“What are you doing?” asked the stranger, giving up on getting more information about the canoes.

“We’re exploring,” said Kimi. “The last people we encountered weren’t very nice,” she added.

“Uh, were they the <unrecognizable word>?”

“The what?” asked Kimi.

“Did you encounter them over there?” asked the stranger, gesturing in the direction Kimi and Jerilyn had just come from, and snapping to give them good enough resolution to tell where he was pointing.

“Yes,” said Kimi.

“What the <unrecognizable> were you doing over there?”

“Uh, we didn’t know not to go there.”

“Uh, well now you know. Good thing you survived. Where are you from?” asked the stranger.

“Elsewhere,” said Kimi, knowing the name of their island wouldn’t mean anything to them.

“Uh-huh. Hey, do you guys need any supplies, like food or anything? We’d be happy to help out if you show us how the canoe works,” said the stranger.

“That would be gr-” Kimi started.

“Kimi, no,” Jerilyn interrupted, “They want to steal our canoe.”

They were both getting hungry, but they’d have trouble getting back home without their canoe. It wasn’t worth the risk. They kept going. They were not pursued.

*

It was a long time before they found land again. When they did find land, it wasn’t an island separated from the old world by water like the others had been, but instead, the rock wall separating them from the old world flattened out to become navigable by foot. They were ravenous, having serious regrets about having ventured so far without food, and on the verge of turning back. So they were quite gratified when they smelled vents. They pulled their canoe onto the shore, located the vents, and gorged themselves on ventmoss. Their hunger sated, they noticed they were getting quite tired, and they went to sleep.

When they awoke, they decided to explore the new land they’d found. They walked inland for quite some time without finding another shore; they’d never imagined a land so vast before. Eventually they became tired again, gave up on finding water on the other side, and turned back. They lost track of the exact route they had taken, and when they reached the shore again, it wasn’t familiar territory. A gust of wind carried a faint smell of vents towards them, and, guessing that it was from the same vents they had found earlier, they followed the shore in the direction the wind had come from. This guess turned out to be correct, and they found their canoe right where they’d left it. They ate some more ventmoss, drank from the water, and rested for a while.

Then they decided to venture uphill, in the direction of the old world; perhaps they would be able to walk on top of the rock ceiling of the old world. The ground gradually steepened, and they kept going long past the point where they had to crawl on all fours, and each step brought them more up than forward. At times, they had to rely on their voices for echolocating footholds when their hands were occupied clinging to the rock and they couldn’t snap. When they turned back, it was due to some combination of the steepness spooking them, and them getting quite tired. They downclimbed facing backwards until the ground had flattened out enough that they could walk upright without falling over, and then they walked their way back to their canoe and the vents, had another meal, and went to sleep.

When they woke up again, they decided to return home. They followed the route they had taken last time, but steered clear of any signs of civilization. When they neared the place where they’d met the people who’d attacked them, they stayed very close to the rock wall separating them from the old world, paddled slowly, and instead of snapping, frequently gently tapped the rock next to them for guidance, in hopes of minimizing noise and not advertising their presence.

When they reached approximately the place where they had first surfaced into the new world, they had some trouble figuring out exactly the right place. Kimi periodically dove into the water with the sonar rods, and in most places, it was easy to tell that they couldn’t be in the right place because the rock extended down too far vertically into the water. But eventually they found a point below them where the rock didn’t extend as far down, and theorized that that might be their route home.

Jerilyn dove below the edge of the rock, rang the sonar rods, and sure enough, there were their buckets of air on the underside of the ridge. She went back to the surface, and they tied their rope to their canoe, filled the canoe with water, and pushed it under while Jerilyn held the rope. They surfaced again after pushing the canoe down a ways underwater, took deep breaths, dove all the way under the ridge until they got their heads in air buckets, and pulled the canoe further down by reeling in the rope until the canoe was below the ridge. Then they dumped the air out of their buckets and carried buckets and rope back to the other side, with a quick stop at the bucket they’d placed midway to take breaths and pack up that bucket too. They were desperate for air by the time they finally surfaced on the other side.

After they finished panting for breath, they reeled in their canoe, and laboriously emptied the water out of it, righted it, and went home. Their parents were delighted to see them, cross at the prolonged absence, and skeptical of their tales of the new worlds they’d discovered.

*

Some time later, Kimi and Jerilyn decided to make another expedition to the new world and try to climb further up the steep cliffs they’d found. Realizing that it would take a long time, and they’d want water and food other than ventmoss, they packed some dried fish and plenty of buckets, and fashioned some seals for their buckets so that water could be stored in them without spilling when jostled around.

They set out along the same path as in their previous expeditions, although it took them some time to find again the indentation in the rock where they’d crossed over into the new world. Once they did, they repeated their usual procedure to get to the other side, after tying their extra buckets (two containing dried fish sealed inside) to the canoe, since carrying the extra buckets underwater themselves would have been too unwieldy.

Once they reached the air on the other side, reeled in their canoe, righted it, and emptied the water out, they took a break to catch their breath. They then continued roughly in the same direction as their previous journey, with a detour to steer clear of signs of civilization before they rejoined their original route, which they successfully stayed on from then on, making for a lengthy but uneventful trip to the place they had landed at on their previous trip.

Unlike on their previous trip, they were not hungry when they reached the land, as they had been snacking on dried fish the whole time. But they were quite tired, so they went to sleep before going any further.

*

When they awoke, they filled the remaining space in their buckets of dried fish with ventmoss, filled two other buckets with water, and took off uphill, each carrying a food bucket over one shoulder and a water bucket over the other. They kept going past where they had turned back the previous time, and not long after, had to backtrack a bit because the route seemed too precariously steep. But after a little exploring, they were able to find a more navigable route up.

After a long ascent and many quick breaks, they decided they needed some sleep. Unfortunately, they were on very steep ground. However, after a bit of exploring, they managed to find a crevice of flat ground big enough for both of them to lie down in, and they went to sleep.

They continued their ascent when they awoke. At one point, Jerilyn, who was in the lead, slipped and fell on a steep stretch. Fortunately, she did not hit Kimi on the way down, and was not far above some flatter terrain on which she managed to stop her fall. Miraculously, the seals on both of her buckets had held.

Kimi downclimbed to join Jerilyn, and asked if she was alright. Jerilyn reported that while she would probably develop some bruises from the fall, she was otherwise undamaged. They looked around for a safer way up, eventually found one, and continued on.

The hill eventually flattened out considerably, and they were able to consistently walk upright without their hands on the ground, though still uphill. The rock ceiling got progressively lower, to the point where it wasn’t far above their heads. In places, they even had to duck under it, though there were also places where the rock ceiling was much higher. At one such point where the rock ceiling was anomalously high, they saw a few small points of light above them at an angle, and in that particular direction, the rock ceiling was further away than they could echolocate, if it was there at all.

Eventually they grew tired, and went to sleep again. When they awoke, they noticed that the ground quite a ways behind them was glowing brightly. The air in a line connecting the rock ceiling to that patch of ground was also glowing faintly. They walked towards it, but the glowing patch narrowed and disappeared before they reached it. They turned back uphill and pressed on.

Later, they saw another glowing patch of ground, again with an accompanying faintly glowing ray of air shooting up to the rock ceiling, well to their right. They headed towards it, but it too narrowed and disappeared before they reached it, and they turned back uphill.

The rock ceiling narrowed further, and they had to crawl to keep going. On multiple occasions, the rock ceiling came so close to the ground that they could not go further, or even merged with the ground, becoming a wall in front of them, and in such cases, they had to backtrack and find a different route up. At one point, the only route forward was so narrow between ground and ceiling that, in order to get through, they had to take their buckets off their shoulders and push them ahead, and advance while lying flat. Kimi, being smaller, had an easier time of this, and at one point, Jerilyn got stuck, but Kimi was able to turn around in a slightly wider spot just ahead and give Jerilyn a hand, helping her get through.

The ceiling rose further above them again, eventually to the point where they could walk upright without ducking. They saw a patch of little points of light ahead of them, and they went in that direction, which required a steep climb. As they drew close, it became apparent that the points of light were coming from a hole in the rock wall, as echoes bounced off rock to every side of the patch of points of light, but not from the patch itself.

They passed through the hole. Though the ground continued to stretch out before them in all directions, there was no longer any wall to either side or in front of them, nor a ceiling above them, as far as they could tell from the echoes of their snaps. There was an almost-vertical wall behind them surrounding the hole, but rather than bending above them into a ceiling as it rose, it bent back in the other direction, as if to form high ground after flattening out further beyond their hearing range. There were many little points of light in every direction above them. There was one big source of bright light, almost a disk, but with one side blunted slightly inwards. There was faint light pervading through the air, so that they could see things in their immediate vicinity, including each other, clearly, despite the bright lights being far above them, and they could even see geological features much farther away than they could hear.

“We found it!” said Kimi, “The place from the legend! Look, there’s the bright light Mom told us about!” She pointed to the big almost-disk of bright light above them.

“Yeah,” said Jerilyn, “It doesn’t hurt to look at, though. And Mom didn’t mention all the other lights. Still, considering it was an ancient myth, it did turn out to be remarkably accurate. That sure is a lot of light.”

They explored the new wide open land, snapping as they went to echolocate the ground, even though they could see it just fine, since they were not accustomed to using light to find their footing. They quickly discovered that it was far larger even than it had first appeared. For instance, they set off in the direction of what appeared to be a patch of vegetation low to the ground, which they could see but not echolocate, but the vegetation seemed to grow larger but draw further away as they approached, not coming within echolocating range until well after they expected it to.

On their way, they heard a burbling sound, and investigating, they found a trail of fast-moving liquid flowing across the ground. Kimi tapped the surface of the liquid hesitantly, then cupped her hands, plunged them under the surface, and brought some of the liquid back up in her hands. It felt like water. She sipped it. It tasted like water. She reported her findings, and Jerilyn followed suit, and concurred. They had never come across such a wide stretch of such fast-flowing, shallow water before. It was a fortunate find, as they had been running low on water, and would have had a hard time on their way back if they hadn’t found more water. The water was fresher than the water in their buckets, so they refilled their buckets with it.

By the time they finally reached the patch of vegetation they’d been headed towards, it became apparent that the vegetation, which they had initially thought to be low to the ground, was actually enormous, with thick stalks extending far over their heads, high enough to extend well past the rock ceiling from home, and branching out, with vegetation covering the branches far above them.

A similar phenomenon occurred when they headed for some small hills in the distance. Again, the hills seemed to draw further away as they approached. But unlike the vegetation, the hills did not also seem to grow as they approached. They pursued the hills longer than they had pursued the vegetation, but the hills still seemed no closer, and their size hadn’t changed. They speculated that perhaps the hills were simply illusions, or perhaps they were vastly further away than the vegetation had been. They were tired. They found a good spot to lie down, and went to sleep.

*

Something was wrong. Kimi opened her eyes and screamed, waking Jerilyn, who also screamed. There was light everywhere. So much light, as if an anglerfish’s lure was right in front of their eyeballs, except that it was coming from all directions.

They quickly identified the source of the light: an inconceivably bright light coming from the ground in the distance, which, true to the legend, it hurt to look at. They turned away from the light, held each other’s hands, and took deep breaths to calm themselves down while they got used to the incredible quantity of light all around them. Jerilyn speculated that, since this light was so bright it hurt to look at, and was located far away on the ground, and the light they’d seen before was merely bright and located far up above them, perhaps the light that their mom had spoken of in the myth, which was supposed to be painfully bright and high up above them, was a conflation of the two lights that they’d seen.

Once they’d calmed down a bit, they kept exploring. The bright light slowly climbed above the ground and into the air, which Jerilyn noted meant she was probably wrong in her earlier speculations. At the same time, the light kept gradually getting even brighter, to the point where it hurt to look in any direction at all, and, counterintuitively, it actually got harder to see as the intensity of light increased. The novelty of so much light flooding their surroundings wore off quickly, so they ended up spending a lot of time with their eyes closed, but their ability to see farther than they could echolocate was useful for navigating, so sometimes they would squint or partially cover their eyes instead.

Eventually, Kimi noticed that her skin hurt. She remarked on this, and Jerilyn noticed that her skin hurt as well. There was no obvious cause for their ailments. Jerilyn speculated that, since they’d been fine before the light got so bright, and the skin under their clothing didn’t hurt, perhaps the light was hurting their skin. They decided to try getting out of the light.

They found some more of the tall vegetation, which was dense enough to block much of the light from coming under it. They took a break under it. It was generally more pleasant there, as it was cooler (it had been warm earlier), and the reduced level of light didn’t hurt their eyes as much.

Their skin kept getting worse, though. This gave them some doubt over whether it was the light that was hurting their skin, but it still seemed possible that it was because of the light, and their skin was continuing to hurt because of damage already done. And they didn’t have any better ideas than staying there; the hole in the rock that they had emerged from was far away, and they didn’t feel like making their way back to it in all the light, in case it was the light that was hurting their skin. They had no guarantee that the light would go away, but since it had been much dimmer earlier, that gave them some hope that it would dim again.

Kimi began to cry out of some combination of fear and the pain of her skin. Jerilyn tried to comfort her, though her skin also hurt, and she was also concerned. They waited there a long time without the light going away. They were exhausted, as they had been woken up by the light well before they would have woken up on their own, but they also couldn’t get to sleep because of the light, stress, and pain in their skin. Their skins were growing blisters, and they were losing hope that the light would go away any time soon, so they were considering making their way back to the hole, when they noticed that the source of the light was slowly making its way back towards the ground. They decided to wait for it to get there to see what would happen.

The light slowly dimmed as the bright light drew close to the ground, and Kimi and Jerilyn took off for the hole they’d emerged from. They’d gotten used to the way that their surroundings would seem to grow and draw away as they approached, and they were able to use landmarks they recognized by sight to navigate back to the hole. They refilled their water buckets again when they reached the fast-flowing vein of water. There was plenty of vegetation and wildlife around them, and they speculated that some of it might be edible, and they had gone through well over half their food, so it was tempting to attempt to restock on food for the return trip, but they didn’t know how to determine what was edible, as they didn’t recognize any of it. Jerilyn was concerned that, since light seemed to be toxic to their skin in high doses, perhaps consuming vegetation that had been exposed to that much light might also be toxic to them (she realized later that this could also be an issue with the water that they’d found, but it wasn’t like they could just not drink water, so it was a risk they’d have to take). They had to make do with the food that they’d already packed.

The bright light in the sky was long gone, and, following its departure, the ambient light continued to dim. By the time they reached the cliff, the ambient light had returned to the level it had been at when they’d emerged, and they could see the little points of light, and the one big light in the sky that had so impressed them when they’d first seen it, but no longer seemed so grand, in comparison to the much brighter light that had replaced it for a time.

They found the hole that they’d emerged from, walked in, and retreated inwards quite a ways from the hole before they collapsed on the ground, exhausted, and slept.

*

Their skins were still painful and sensitive when they awoke, and the hole they’d traveled through was glowing intensely.

They continued on their way back home, but when they got to approximately the point where they thought the narrow spot they’d crawled up through was, it took them a long time before they found it. Crawling back through it was quite painful, as it was impossible to climb through without scraping their sensitive skin. But after some painful struggle, they made it through.

Their progress down was much slower than their progress up had been, both because of their skin sensitivity slowing down their crawling, and because it was difficult to retrace their steps. Recognizing this, they rationed their food and water so that it would last long enough. During the phase of their journey where they had to crawl under a low ceiling, they seemed hopelessly lost for a long time before they finally made their way to an area where they had enough room to stand up, and in that more open area, they were eventually able to find what seemed to be their previous path. Satisfied that they were no longer lost, they went to sleep before continuing.

They had an easier time following their route up from then on. Despite their skin pain and weariness slowing them down, they actually exceeded their pace from the way up on the flatter portions of the trip, but they lost that extra time on the portions where they had to climb. Despite their efforts to conserve food, they ran out by the next time they stopped to sleep.

After descending further for quite a long time, they ran out of water, but they realized that they were getting close to the water, the vents, and their canoe. They desperately needed more sleep, but they needed food and water more, so they pressed on. They were quite relieved when they finally reached the bottom. They drank from the water, sated their hunger with ventmoss, and went to sleep. When they woke up, they got in their canoe and set off for home.

The Knot

George and I were afraid we might be late for our meeting with the wizard, and not wishing to keep them waiting for us, we rushed there. Just before we arrived, I checked my phone and saw that we were two minutes early. I apprehensively prepared myself to knock on the door, but it swung open before I did so, revealing an impressively cluttered office. There was no one inside. George and I looked at each other.

“Do you suppose we should go in?” George asked.

“I don't think the door would have opened if we weren't supposed to,” I said. After some hesitation, I stepped inside, and George followed. The door closed behind us, causing us both to reflexively turn back towards the door. I tried the door handle, and found that the door offered no resistance to being opened again. Reassured that we weren't trapped, I closed the door again, to return it to the condition that I presumed the wizard preferred it in.

I noticed a loop of thin red rope hanging over the doorknob. There were no ends tied together in a knot. I picked it up to look for where the ends had been fused together, but could not find any joint; the rope appeared to have been constructed in a perfectly homogeneous circle.

I placed the loop of rope back around the doorknob, and turned my attention to the other objects filling the room. There was a perfectly spherical orb sitting on the wizard's desk. The orb had a cloudy appearance, and the clouds drifted aimlessly on the surface of the orb, despite the orb otherwise appearing to be solid. There was a shelf on a wall, holding an old oscilloscope, a set of the five platonic solids, each made out of smooth black material, and a beaker that held a liquid which was dancing around violently, but never spilling out of the beaker despite appearing to come close very frequently. There was a fireplace in the corner with a fire, but the only material in the fireplace was a bird sitting in the middle of the fire. The bird wasn't burning, and it looked like it was sleeping. The bird raised its head to look at us quizzically, and then went back to sleep. I heard a faint popping sound, which I soon figured out had come from the liquid in the beaker. There was a bookshelf, completely packed with books, covering an entire wall, and there were also a few open books and many loose sheets of paper covering the wizard's desk, as well as a few sheets of paper that had fallen to the floor. Very few of the papers I saw were in English, and most weren't in any script I recognized. Some didn't appear to have any writing at all on them, consisting only of cryptic diagrams.

I noticed a strand of rope sticking out from under some papers on the wizard's desk. It appeared to be made out of very similar material to the loop of rope on the doorknob, except that it was green instead of red. I carefully moved the papers that were on top of it out of the way so I could see the rest of the rope. Like the rope hanging on the doorknob, it formed a closed loop. There were three points where the rope crossed over another part of the rope. The crossings alternated, in the sense that if you started at any crossing, and followed the strand on top around the loop, it would lead to the bottom strand of the next crossing it encountered, and then the top strand of the third crossing, and then the strand going under the point where you started, and so on. Only part of the rope was the green I had initially seen; another stretch of rope was red, matching the loop of rope on the doorknob, and part of it was blue. The rope was arranged so that the three points where the colors changed were hidden under the crossings.

I moved the portion of red rope that crossed over the boundary between green and blue, so that I'd be able to see the point where the color of the rope changed from green to blue. To my surprise, the piece of rope that I had just uncovered was solid blue all the way up to the new point that the red strand crossed over it. George asked me how it had done that, but I didn't know, and I ignored the question. I wiggled the red strand some more, but the portion of the rope it was moving over kept changing between blue and green so that the color switch always occurred exactly under the red strand. I tried holding the red strand in place and pulling the green strand under it, but again blue rope turned green just as it emerged out from under the crossing. I lifted the red strand into the air, and moved my head around to look under it from both directions. The color of the lower strand shifted in unison with my head, so that I never caught a glimpse of the boundary between the colors. I wiggled the strands going over the other two crossings to see if they would exhibit the same phenomenon, and they did. I paused for a moment to stare at the rope in confusion, and then picked up a piece of green rope and moved it over the blue portion of the rope, forming two additional crossings. Blue rope turned red as the green strand passed over it, forming an additional stretch of red rope in the middle of the blue part of the rope, again with the color change happening precisely under the crossings. Next I tried moving the green strand over the point where the blue strand crossed over the boundary between red and green. As I had anticipated, the stretch of rope going over the crossing turned from blue to red as the green strand passed over it, and an additional short stretch of blue rope had formed out of the red rope coming out from under the crossing, with all color boundaries being hidden behind other stretches of rope. 

I returned the loop of rope to its original configuration, and then tried twisting part of the blue portion of the rope, so that it crossed over itself. This did not cause any color changes, and I undid the twist.

“Hey George, I want to try something. Can you go around to the other side of the desk for a minute?” I said.

“Are you sure the wizard will be okay with us messing with his stuff like this?” George asked.

“I'm sure it'll be fine. Come on,” I said, pushing George in the intended direction. I actually had no idea whether or not the wizard would mind, but my curiosity had won out over my fear of offending the wizard. George walked around to the other side of the desk as I had requested.

“Okay, now look closely at this crossing,” I instructed, pointing to where the green stretch of rope passed over the boundary between the red and blue strands, which we were looking at from opposite sides. I crouched so that I was looking at the knot from a shallower angle, and George followed my example. I lifted the green strand going over the crossing up in the air. I was looking at the crossing from the side that the red strand was coming out from, and the blue stretch of rope coming out the other side appeared to turn red as the green rope passed in front of it in my field of vision.

“What's it look like?” I asked.

“The rope under the green strand is now blue up until the point where it crosses behind the strand,” he said. I put my finger on the red rope directly under the green part I had lifted.

“So this looks blue?” I asked.

“Yeah,” he said.

“So you can see my finger touching a blue stretch of rope?” I asked.

“Yeah, that's what I said,” George confirmed. I stood up and bent over to look at the rope from above, and pressed the green strand I was holding into my face running vertically between my eyes, so that I could see the piece of rope crossing under it from opposite sides of the green strand with each eye. It was a purple blur that could have been the result of red light reflecting off the rope into my right eye and blue light reflecting off the rope into my left eye. I unfocused my eyes so that the stretch of rope I was looking at would appear in different places in my field of vision in each eye, and indeed, it appeared as separate red and blue strands.

Suddenly remembering the loop of rope on the doorknob, I dropped the rope I was holding and went to go get it. George walked back around the desk to the side facing the door. I returned with the red loop of rope and held it over the rope on the table. The green and blue portions of the rope that I could see through the red loop had switched colors, while the red portion of the rope on the table was not changed in appearance by viewing it through the red loop. I lifted part of the rope on the table, and slid the loop of red rope under it. The loop was no longer red all the way around, with color changes whenever it passed under a strand of rope of a different color. I grabbed the formerly red loop of rope by a blue stretch in the center of the loop of rope on the table, and pulled it out. I was holding a solid blue loop of rope. I put the blue loop of rope aside, took out my phone from my pocket, and opened the camera. I lifted the green strand and put my phone under it to take a picture of the spot where the rope crossing under it switched from red to blue. The camera image on the screen showed the strand changing from red to blue right under the spot where the green strand crossed over the phone, so that the boundary between red and blue wasn't visible on the screen. I took a picture, and then moved the rope out of the way so that I could see the spot where the color changed. But the picture I saw on the phone screen was of a completely red strand of rope. I moved the phone back under the green strand, and saw that the still image of a strand of rope in my camera was changing from red to blue as I moved the green strand over it. I pulled the phone back out the other side of the green strand, and it bore an image of a completely blue strand of rope. I closed the picture so I could take another one. The image of the knot in the phone screen looked the same as the actual knot, except that the colors red and blue were switched. 

I put down the phone, and pulled a pen and small notebook out of my pocket. I tore off a page of the notebook, and wrote on it the current color of the loop of rope I had taken from the doorknob (blue). I folded up the piece of paper, slipped it under the multicolored loop of rope with the crossings, and pulled it out through the center. I unfolded it, and found the word “green” written on it, in my handwriting, instead of the “blue” that I had written. I picked up my phone and called a friend.

She picked up, and before she said anything, I said, “Hi Kelly. Pick a color. Red, green, or blue?”

“Blue. Why do you ask?” she said.

“I'll explain later. Thanks. Bye,” I said, and hung up. I wrote “blue” under the word “green” on the piece of paper, folded it back up, and slipped it under the knot and pulled it out through the center, as I had done before. I unfolded it, and saw that the word “green” that had been near the top of the paper had turned into “red”, while the word “blue” that I had written when Kelly picked it had remained unchanged. I also noticed that the pen I was using had blue ink, and the color of the ink on the page had never changed. There were a couple more things I wanted to try. I thought through what I was going to do, and then called Kelly back.

“Can you pick a color again? Same options,” I asked.

“Red,” said Kelly.

“Thanks,” I said, and hung up. I lifted part of the knot into the air and stuck my right hand under it, so that my hand was sticking out through the center part of the knot. The plan was to hand the phone from my left hand to my right hand, and then pull it with my right hand back from under the knot, except that if Kelly had named the current color of the loop of rope that had been on the doorknob, I would only go through the motions of this without actually holding the phone. The loop of rope from the doorknob was blue, and Kelly had said red, so I kept the phone in my left hand as I moved my left hand towards my right, and I attempted to grasp the phone with my right hand. But while I saw my right hand grab the phone, I felt my fingers pass through thin air where I saw the phone. I withdrew my right hand out from under the knot, and while the phone was definitely pulled out of my left hand, and I saw my right hand holding the phone as it receded, I felt my right hand in a fist closed around nothing. As my hand passed out from under the knot, the fist became visible and the phone seemed to disappear. George gasped, as this was the first sign visible to him that anything was amiss.

“Where'd your phone go?” he asked.

“I don't know. In retrospect, I probably shouldn't have used my phone for that. At least we've still got your phone if we want to try taking more pictures,” I said. I felt rather foolish, as I had actually identified this outcome in advance as consistent with previous observations, but somehow hadn't seriously considered the possibility that it would actually happen.

“You just managed to lose your phone in the magic rope. I'm not letting you touch mine,” said George. He had a point. I thought about how I might get the phone back, but couldn't think of anything, and besides, there was another experiment I'd been going to try. I reached for the red strand of rope (chosen because it was the color that Kelly had picked), but before I touched it, it started receding under the green strand, as if the blue strand on the other side was being pulled, but the blue strand itself was motionless, and rather than turning blue as it came out from under the green strand, the red rope would simply vanish as it passed under the green strand, leaving a significantly shortened stretch of red rope by the time this stopped. The point where the red strand disappeared under the green was no longer aligned with the point that the blue strand came out from under the green on the other side. I grabbed the red rope near where it crossed under the blue strand and pulled. More red rope came out of nowhere so that the red strand still continued all the way up to where it disappeared under the blue strand, even as I pulled it away, just as if the green strand on the other side were passing under the blue strand and turning red, but the green strand itself did not move. The point where the red strand passed under the blue strand and vanished also became misaligned with the point where the green strand emerged out the other side. When I stopped pulling on the red strand, there was about the same amount of red rope visible as there had been before some of it had vanished under the green strand.

“Hello, folks. Sorry I'm late,” came a voice from behind us in a heavy accent that I didn't recognize. George and I turned around and saw someone of unidentifiable gender in robes and a pointy hat, carrying a wooden staff with a hexagonal piece of metal attached to the side and a shiny truncated octahedron fastened to the top, and wearing a ring on each of their ten fingers, each in a different style. The door was closed behind the wizard. I hadn't heard it open or close. The wizard's eye caught the knot of rope on their desk.

“Oy, the bloody thing's out of sync again,” they said, and walked over to the desk, put the staff down leaning against the desk, pulled a wand out of their robes, and jabbed their wand at the knot. They put their wand back in their robes, picked up the knot of rope, and threw it up in the air. When it landed back on the desk, the strands were perfectly aligned with each other again.

“There we go,” said the wizard. They picked up their staff and gestured with it towards a wall, out of which leapt two folding chairs, which positioned themselves in front of the wizard's desk and unfolded into chairs that did not look the least bit like folding chairs.

“Have a seat,” said the wizard, indicating the chairs. I put my phone back in my pocket, and sat down.

Why The Apple Falls

Like many children do, when my son Isaac was a little boy, he once asked me “How do people on the other side of the world stay up there? Wouldn't they fall down here?” So of course I explained that everything always falls toward the ground, even though that's the opposite direction on the other side of the world.

Isaac thought about this for a moment, and then asked, “What if you were halfway between here and the other side of the world? Which way would you fall?”

“As I said, you always fall towards the ground,” I told him, “So if you went East until you were halfway to the other side of the world, you would fall towards the ground there, which is the direction you'd call down once you were there, even though in a way, it's the direction we'd call East here. And if you went West, you'd still fall towards the ground there, which-”

“No!” he said, cutting me off, “What if you went that way,” he pointed straight up, “until you were halfway to the other side of the world?”

“You mean where the sun is?” I asked.

“Yes. What if you were holding onto the sun, and you lost your grip? Which way would you fall?”

There were so many things wrong with that question, I wasn't sure where to begin. “How would you get there?” I asked.

“I don't know, maybe you could use a ladder,” said Isaac.

“You can't balance a ladder going that high.”

“You could make a ladder long enough to go all the way to the other side of the world, prop both ends against the ground, and climb to the middle,” he said. It took me a while to figure out what he was getting at.

“Ok, but you can't even build a ladder that big.”

“But what if you did, Daddy? If you built the ladder and grabbed onto the sun and let go, which way would you fall?”

“You'd probably go blind from getting too close to the sun first,” I said.

“You could do it at night.” I had to admit he got me there. I was going to object that you probably still couldn't actually hold onto the sun, but I decided against it. I could see what I was supposed to say.

“Well, Isaac, I guess you'd fall towards the ground.”

“You mean away from the sun?” he asked.

“Yeah, away from the sun.” I was surprised he needed that clarification.

 

It's cute when little kids do it, but Isaac never gave up his habit of asking stupid questions. He's a great hunter, and he'll make a fine warrior too, but I can't say he's smart.

One day, when he was a teenager, he asked me, “Dad, why is it easier to throw a spear West than to throw it East?”

“Probably because you were throwing it on the Western face of a hill,” I said.

“I thought of that,” he said, “but it's like that everywhere. I even tried some target practice when we went to Brythsville, and that's almost on the other side of the world. East can't be uphill all the way around.”

“Maybe the wind was blowing West each time you tried?”

“No, the wind shouldn't affect a spear much. Besides, I've noticed the same thing in calm wind, and when the wind isn't blowing West.”

“Then you're probably imagining things. It can't be easier to throw a spear West than East on flat ground. That doesn't make sense,” I told him.

“I know it doesn't make sense, but I'm not imagining it. It's very consistent,” he insisted. We kept arguing about it for a while, and he kept rejecting all of my proposed explanations, but wouldn't let go of the idea that it was still easier to throw a spear West than East.

 

A few days later, as I was walking home through the village square, I heard Isaac's voice shout “Dad!” from the top of the clock tower. I looked up, and saw him perched on top of the clock tower with his friend Emmy and a bucket. Emmy waved.

“How'd you get up there?” I asked.

“Watch closely,” Isaac said, ignoring my question, and he poured a bunch of pebbles out of the bucket.

“What?” I asked, after the pebbles had all hit the ground.

“You didn't see them curve?” he asked.

“Curve? No.”

Isaac and Emmy climbed down the clock tower. “If you look closely, the pebbles curve a bit to the West as they fall,” Isaac said.

“It's probably just the wind,” I said, as Isaac and Emmy started picking up pebbles and putting them back in the bucket.

“The wind's pretty calm right now,” said Isaac. He was right. “Besides, exactly the same thing happened when we poured the pebbles inside the clock tower. Let me show you.” Isaac started climbing back up the clock tower with the bucket slung over his shoulder. Emmy led me inside the clock tower, and started explaining what was going to happen. The clock tower had no roof, so there was plenty of light. She pointed out that there were visible vertical lines on the walls formed by the edges of every other brick, and explained that since not all the pebbles were going to fall from the bucket at the same time, you could compare the positions of the highest pebbles to the positions of the lowest pebbles to see a line tracing out the path formed by the pebbles, and that it was going to curve slightly to the West, enough to be visible once the pebbles got near the ground.

Isaac reached the top, and started slowly pouring the pebbles out of the bucket. Sure enough, they followed exactly the path Emmy had said they would, curving just a tad to the West.

“Huh, you're right,” I said. Isaac started descending the tower.

“You see?” Isaac said, “There must be a small force pulling everything just a little to the West all the time, and it's usually too small to be noticeable unless something is in the air for long enough. That's why it's easier to throw a spear West than East-”

“Oh, not this again.”

“Because the spear is being pulled West. So in a way, it is kind of like East is uphill all the way around.”

“That's ridiculous!” I said, “A mysterious force pulling everything West?” I jokingly pretended I was being pulled involuntarily to the West, and screamed, “Aaaaaaahhh!” before ending the act and laughing.

“No, it's just so small that it can't pull you over when you're standing, and you don't usually notice it,” he insisted.

“Still, it makes no sense for everything to move West mysteriously for no reason,” I said.

Isaac started grinning. “You're wrong,” he said. He picked up a pebble, dangled it out in front of himself, and dropped it. He paused for dramatic effect while I wondered what he was getting at. “I didn't push that pebble down,” he said, “It just mysteriously moved downwards for no reason.”

“Yeah, it fell. Things fall down.”

“Exactly! If things can be pulled downwards without anything touching them, why can't they be pulled a little bit West without anything touching them?”

“Ha! Well, if everything's getting pulled West, do spears veer off to the West when you throw them North or South?” I asked. I saw the confidence disappear from his face.

“Also, if there's a force pulling everything West that's just like the force pulling everything down except weaker, why would the pebbles curve to the West? Wouldn't they just move in a straight line that's sloped a little bit to the West?” Emmy added. Isaac looked like he was about to answer this, but then stopped, like it took him a moment to realize that he didn't have an answer.

“Nothing's getting mysteriously pulled West,” I said, “it's probably just that this tower is skewed a bit, so it looks like the pebbles move West when they actually fall straight down, just like everything always does. That's all.”

“Why do they curve, then?” Emmy asked.

“They probably don't,” I said, “They just went by fast enough that it was hard to tell exactly what the path looks like, and we tricked ourselves into thinking it was curved.” I couldn't believe I'd briefly bought into that nonsense about the pebbles falling in a curved path.

“Nope, definitely curved. We all saw it,” Emmy insisted. I argued about it with her for a bit, while Isaac just stood around looking confused.

 

“The pebbles do go in a curved path,” Isaac said a couple days later.

“Huh?” I said. I hadn't been thinking about the events a couple days prior, so it took me a moment to figure out what Isaac was talking about.

“The pebbles that you said must fall in a straight line from the top of the clock tower,” Isaac said, “Emmy and I tested it more precisely by dangling a rope off the top of the tower, and comparing the path the pebbles fell to the rope. The pebbles landed West of the end of the rope when we poured them directly in front of the rope. And you had almost convinced me that we were imagining the curving earlier, but when you compare it to the rope, it's harder to deny. The path was definitely curved. Which is pretty weird, when you think about it. Like, why would the rope just dangle straight down while the pebbles curve to the West? I think it might be that things only get pulled West when they're moving. The rope is just hanging there, not moving, so it doesn't get pulled West. But the pebbles are falling, so they get pulled West, and once they've fallen farther, they've picked up more speed, so they get pulled West harder, which would explain why they're curved. This could still explain why it's easier to throw a spear West than East, since the spear is moving, and why we don't feel ourselves getting pulled West, since we don't move very fast. But you had a good point about throwing spears North or South. They don't curve to the West at all. So maybe the direction it's moving matters. Things moving down, East, or West get pulled West, but things moving North or South don't. This seems pretty strange. Why would it work that way? I'm curious what happens to things that are moving up, but I can't figure out how to find out. It's hard to throw something straight up, and also hard to see what it's doing once you do. I did think of one thing we could try which would be really cool, but I don't think we could get enough rope. If we could stretch a piece of rope all the way around the world, and then pull both ends of the rope, we'd lift the whole rope up into the air. Then, if moving up also makes things get pulled West, we'd see the whole rope rotate to the West.”

I found basically everything he'd just said pretty implausible. “There definitely isn't enough rope to do that,” I said.

“Yeah, I know. I was just saying it would be really awesome if there was. And I'd be able to find out what happens to things that move straight up,” he said.

 

“Things get pushed to the East when they move up,” Isaac told me the next day.

“What convinced you of that?” I asked.

“Emmy and I cut a hole in a piece of wood to thread a rope through, tied a brick to one end of a long rope, and dropped the brick off the top of the clock tower while the rope was threaded through the hole in the wood. The other end of the rope moved East of the hole by the time it got pulled all the way up to the piece of wood. It actually took us a while to figure that out, since the wood was blocking our view of the end of the rope from the top of the tower. When one person watches from the ground, it's hard to see what the end of the rope is doing all the way up there from the ground; each of us took a turn watching from the ground, and neither of us could tell whether the end of the rope moved. So we dipped the end of the rope in paint and tried it again. The paint all splattered to the East of the hole. How weird is that? Things moving East, West, or down get pushed to the West, things moving up get pushed East, and things moving North or South don't seem to get pushed at all. Why? What's the pattern there? It makes no sense!” Isaac seemed oddly incensed about this.

“You're right about one thing, which is that that doesn't make any sense,” I said, “It was probably just from the rope randomly fluttering around. You don't need to postulate some sort of mysterious force that notices when things are moving and pushes them off in some other direction in order to explain a simple paint splatter.”

“It wasn't random,” he said, “We repeated it several times, sometimes changing details like what direction we held the brick away from the wood before dropping it. The paint always splatters East.”

 

A few days later, Isaac was out doing some target practice with a spear, and when he came back, he said, “You know, it occurred to me, if moving East or West causes things to get pushed West, then if you throw a spear in a diagonal direction, it's partially moving East or West as well as partially moving North or South, so it should veer off to the West. Like, if you throw it Southwest, it should veer off to the right. But that's not what actually happens. It just goes straight.”

“Ha! I told you it was all a bunch of nonsense,” I said.

“I thought of a better explanation, though,” he said, “I think when things move West, they get pushed a little bit up, and when things move East, they get pushed a little bit down. That still explains why it's easier to throw a spear West than East, because it gets a little boost upwards when thrown West and gets pushed a bit harder downwards when thrown East. This also means that it should be a little bit easier to throw it Southwest or Northwest, but not as easy as throwing it West, since it's only partially moving West, and thus should get a smaller boost, and similarly, it should be a little bit harder to throw it Southeast or Northeast, but not as hard as throwing it East. I think this is what actually happens, but it's hard to tell, since the effect is pretty subtle. And it makes so much more sense this way. Anything moving gets pushed in the direction that's 90 degrees clockwise from the direction it's moving, from the perspective of someone facing North. It's a clear pattern. I just still don't get why, though.”

I told him that that didn't make any more sense than what he'd been saying earlier, and he was probably imagining things. But he seemed pretty convinced that his new version of the story was better, somehow, and he kept trying to get me to help him come up with an explanation for it.

He never let go of this idea that things get pushed clockwise from the direction they're moving from the perspective of someone facing North. Every few months or so, I'd think he'd finally forgotten about the crazy idea, and he'd suddenly bring it up again, usually asking my opinion on some inane question like whether it had something to do with why things fall down, or if something moving West fast enough would fall up, or whether something could keep moving around in a big circle, moving down fast enough that it gets pushed West enough that its Westward movement makes it get pulled up, and its upward movement then making it get pulled East, and its Eastward movement making it move back down again. (I answered “no, of course not” to all three of those questions.)

 

Years later, Isaac and some of the other young men were having a contest to see which of them could throw a large rock the farthest. They were taking turns spinning around with the rock to gain speed and then throwing it forward, at which point others would mark where it hit the ground, and then the next person would bring it back and throw it again. When it was Isaac's turn, the rock landed right next to the marker for where the rock had fallen from the best previous throw. They decided that Isaac's throw was a little bit shorter. I told them it looked like a tie to me, but they ignored me, and Isaac came in second place.

I tried to comfort Isaac about his loss afterwards, repeating that it looked like a tie to me, and saying he made a really great throw.

“It's only partially a throw,” he said, “It's also largely just spinning around and letting go. Once you're spinning around, the only thing stopping the rock from flying away is the fact that you're holding onto it, so you just have to let go to send it flying.” There was a pause before he continued, “I've got a riddle for you.” (That's what he says when he's about to ask a stupid question that he says he knows the answer to.) “When you're spinning around while holding a rock, if you pull the rock towards yourself, you'll spin faster. Does that mean it's a good idea to pull the rock towards yourself before letting go, to give it some extra speed?”

“No, of course not,” I said.

“Right, but why not?”

“Because you're trying to throw the rock away from you, not towards you. You'd be pulling it the wrong way,” I said.

“Sure, but then why would you spin faster when you pull the rock towards you?”

“I don't see what that has to do with throwing the rock forward.”

“The rock gets thrown forwards because of the speed it built up from spinning around, so if you spin faster, you should be able to throw it faster. But a big part of the reason pulling the rock inwards makes you spin faster is that, since the rock is closer to you, a full circle around you is shorter, so the speed the rock already had would take it all the way around you in less time. So the fact that you're spinning faster doesn't necessarily mean the rock is moving faster. It isn't clear to me what the effect of pulling the rock inwards on its speed is. The rock could speed up anyway, because if the rock is moving inwards while you're pulling it farther inwards, then you're pulling it in the direction of motion, which should make it speed up. Put differently, when you're a quarter-turn before the point where you let go of the rock, then pulling the rock inwards is actually the right direction. On the other hand, pulling the rock inwards could also slow it down, because once you pull it inwards, the rock is spinning around faster than you are, so it ends up pulling you forwards, which means you're also pulling the rock backwards, slowing it down. I think the second effect is probably bigger, so pulling the rock inward slows it down overall. In any case, the appearance that the rock moves forward faster when you pull it inwards is largely illusory, for the same reason it feels like the rock is pushing outwards in the first place; the rock's just trying to keep moving in the same way it's been going, but that isn't maintaining the same position relative to you.”

“Hm,” I said.

Isaac stopped talking for a bit, and I tried making small talk, but he seemed distracted and would only give at most two-syllable replies, so I gave up. I thought it was because he was sad over losing the contest, and forgot that he always acts like that right before saying something really inane. Isaac looked up for a bit.

“I figured it out,” said Isaac.

“Figured what out?”

“We're spinning.”

“Spinning?”

“Yeah. The world is spinning. You know how if you're spinning while holding a rock and you let go, it moves away from you just because it keeps moving the same way it was before? Well check this out,” he said, and jumped up to grab an apple from a tree that we were passing under. He held the apple in front of himself and dropped it. “Let go of the apple, and it seems to move away from the sun, the center of the world. Because the world is spinning. The apple just kept moving the same way it had already been moving, and the ground rose to meet it, because the ground is constantly being pulled towards the center of the world. That's why it's easier to throw a spear West than East. The world is spinning to the East, so anything moving East is spinning faster, so it moves away from the center of the world faster; that is, it falls faster. And anything moving West is spinning slower, so it falls slower. If something is very high, it is closer to the center of the world, so it's moving slower to keep up with the rotation of the world. When it falls, it gets farther away from the center of the world, so the same speed isn't enough to keep up with the world's rotation anymore, and part of its speed is directed down instead of East. Both of these effects make it look like it's moving West, opposite the direction of rotation. And if something moves up, the same speed makes it rotate around the center of the world faster than everything else, and the speed that was added to make it go up ends up pointing East. Both of these effects make it appear to move East.”

I laughed, and pointed out that I couldn't see anything spinning, but Isaac just said that was because I was spinning the same way everything else was, so nothing would look out of place. I countered that if I jump, I land right back where I started instead of West of where I started, so the ground couldn't be rotating to the East under me. He had an answer to that, believe it or not. He said I keep moving East too, so I stay right over the same point on the ground.

And that's the story of how my son Isaac became convinced that the world is constantly spinning around in a circle. I think that's nonsense. The world doesn't look like it's spinning, and I don't think we need to suppose it is to explain the simple fact that things fall down and some mysterious forces acting on moving objects in Isaac's imagination.

Metamathematics and probability

Content warning: mathematical logic.

Note: This write-up consists mainly of open questions rather than results, but may contain errors anyway.

Setup

I'd like to describe a logic for talking about probabilities of logical sentences. Fix some first-order language {\cal L}. This logic deals with pairs \left(\varphi,p\right), which I'm calling assertions, where \varphi\in{\cal L} is a formula and p\in\left[0,1\right]. Such a pair is to be interpreted as a claim that \varphi has probability at least p.

A theory consists of a set of assertions. A model of a theory T consists of a probability space \left(X,P\right) whose points are {\cal L}-structures, such that for every assertion \left(\varphi,p\right)\in T, P_{*}\left(\left\{ {\cal M}\in X\mid{\cal M}\models\varphi\right\} \right)\geq p, where P_{*} is inner probability. I'll write T\vdash_{p}\varphi for "\left(\varphi,p\right) can be proved from T", and T\models_{p}\varphi for "all models of T are also models of \left\{ \left(\varphi,p\right)\right\} ".
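For intuition, here is a minimal Python sketch of the model condition in a toy finite case. It assumes a propositional stand-in for {\cal L}: structures are dictionaries of truth values, formulas are Python predicates on structures, and the probability space is a finite list of weighted structures, so that inner probability coincides with ordinary probability. The names `probability` and `is_model` are made up for this illustration and are not part of the logic itself.

```python
from fractions import Fraction

def probability(space, formula):
    """Total weight of the structures in `space` that satisfy `formula`.

    `space` is a list of (structure, weight) pairs with weights summing to 1.
    In this finite toy setting every set is measurable, so this is also the
    inner probability.
    """
    return sum(weight for structure, weight in space if formula(structure))

def is_model(space, theory):
    """Check that every assertion (formula, p) in `theory` holds in `space`,
    i.e. that the formula has probability at least p."""
    return all(probability(space, formula) >= p for formula, p in theory)

# A two-point space: the atom "A" is true with probability 3/4.
space = [({"A": True}, Fraction(3, 4)), ({"A": False}, Fraction(1, 4))]
holds_A = lambda structure: structure["A"]
not_A = lambda structure: not structure["A"]
```

With these definitions, `is_model(space, [(holds_A, Fraction(1, 2))])` is true, while `is_model(space, [(holds_A, Fraction(9, 10))])` is false, since the space gives the formula probability only 3/4.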

The rules of inference are all rules \Gamma\vdash_{p}\varphi where \Gamma is a finite set of assertions, and \left(\varphi,p\right) is an assertion such that P_{*}\left(\left\{ {\cal M}\in X\mid{\cal M}\models\varphi\right\} \right)\geq p in all models of \Gamma. Can we make an explicit finite list of inference rules that generate this logic? If not, is the set of inference rules at least recursively enumerable? (For recursive enumerability to make sense here, we need to restrict attention to probabilities in some countable dense subset of \left[0,1\right] that has a natural explicit bijection with \mathbb{N}, such as \mathbb{Q}\cap\left[0,1\right].) I'm going to assume later that the set of inference rules is recursively enumerable; if it isn't, everything should still work if we use some recursively enumerable subset of the inference rules that includes all of the ones that I use.

Note that the compactness theorem fails for this logic; for example, \left\{ \left(\varphi,p\right)\mid p<1\right\} \models_{1}\varphi, but no finite subset of \left\{ \left(\varphi,p\right)\mid p<1\right\} implies \left(\varphi,1\right), and hence \left\{ \left(\varphi,p\right)\mid p<1\right\} \nvdash_{1}\varphi.

Any classical first-order theory T can be converted into a theory in this logic as \left\{ \left(\varphi,1\right)\mid T\vdash\varphi\right\} .

Löb's Theorem

Let T be a consistent, recursively axiomatizable extension of Peano Arithmetic. By the usual sort of construction, there is a \Sigma_{1}^{0} binary predicate \square_{y}\left(x\right) such that T\vdash_{p}\varphi\iff\mathbb{N}\models\square_{p}\left(\ulcorner\varphi\urcorner\right) for any sentence \varphi and p\in\left[0,1\right]\cap\mathbb{Q}, where \ulcorner\urcorner is a coding of sentences with natural numbers. We have a probabilistic analog of Löb's theorem: if T\vdash_{p}\square_{p}\left(\ulcorner\varphi\urcorner\right)\rightarrow\varphi, then T\vdash_{p}\varphi. Peano arithmetic can prove this theorem, in the sense that PA\vdash_{1}\square_{p}\left(\ulcorner\square_{p}\left(\ulcorner\varphi\urcorner\right)\rightarrow\varphi\urcorner\right)\rightarrow\square_{p}\left(\ulcorner\varphi\urcorner\right).

Proof: Assume T\vdash_{p}\square_{p}\left(\ulcorner\varphi\urcorner\right)\rightarrow\varphi. By the diagonal lemma, there is a sentence \psi such that T\vdash_{1}\psi\leftrightarrow\left(\square_{p}\left(\ulcorner\psi\urcorner\right)\rightarrow\varphi\right). If \square_{p}\left(\ulcorner\psi\urcorner\right), then \square_{1}\left(\ulcorner\square_{p}\left(\ulcorner\psi\urcorner\right)\urcorner\right) and \square_{p}\left(\ulcorner\square_{p}\left(\ulcorner\psi\urcorner\right)\rightarrow\varphi\urcorner\right), so \square_{p}\left(\ulcorner\varphi\urcorner\right). This shows that T\cup\left\{ \left(\square_{p}\left(\ulcorner\psi\urcorner\right),1\right)\right\} \vdash_{1}\square_{p}\left(\ulcorner\varphi\urcorner\right). By the assumption that T\vdash_{p}\square_{p}\left(\ulcorner\varphi\urcorner\right)\rightarrow\varphi, this implies that T\cup\left\{ \left(\square_{p}\left(\ulcorner\psi\urcorner\right),1\right)\right\} \vdash_{p}\varphi. By a probabilistic version of the deduction theorem, T\vdash_{p}\square_{p}\left(\ulcorner\psi\urcorner\right)\rightarrow\varphi. That is, T\vdash_{p}\psi. Going back around through all that again, we get T\vdash_{p}\varphi.

If we change the assumption to be that T\vdash_{q}\square_{p}\left(\ulcorner\varphi\urcorner\right)\rightarrow\varphi for some q<p, then the above proof does not go through (if q>p, then it does, because \left(\theta,q\right)\vdash_{p}\theta). Is there a consistent theory extending Peano Arithmetic that proves a soundness schema about itself, \left\{ \left(\square_{p}\left(\ulcorner\varphi\urcorner\right)\rightarrow\varphi,q\right)\mid q<p\right\} , or can this be used to derive a contradiction some other way? If there is no such consistent theory, then can the soundness schema be modified so that it is consistent, while still being nontrivial? If there is such a consistent theory with a soundness schema, can the theory also be sound? That is actually several questions, because there are multiple things I could mean by "sound". The possible syntactic things "sound" could mean, in decreasing order of strictness, are: 1) The theory does not assert a positive probability to any sentence that is false in \mathbb{N}. 2) There is an upper bound below 1 for all probabilities asserted of sentences that are false in \mathbb{N}. 3) The theory does not assert probability 1 to any sentence that is false in \mathbb{N}.

There are also semantic versions of the above questions, which are at least as strict as their syntactic analogs, but probably aren't equivalent to them, since the compactness theorem does not hold. The semantic version of asking if the soundness schema is consistent is asking if it has a model. The first two soundness notions also have semantic analogs. 1') \left\{ \mathbb{N}\right\} is a model of the theory. 2') There is a model of the theory that assigns positive probability to \mathbb{N}. I don't have a semantic version of 3, but metaphorically speaking, a semantic version of 3 should mean that there is a model that assigns nonzero probability density at \mathbb{N}, even though it might not have a point mass at \mathbb{N}.

Motivation

This is somewhat similar to Definability of Truth in Probabilistic Logic. But in place of adding a probability predicate to the language, I'm only changing the metalanguage to refer to probabilities, and using this to express statements about probability in the language through conventional metamathematics. An advantage of this approach is that it's constructive. Theories with the properties described by the Christiano et al. paper are unsound, so if some reasonably strong notion of soundness applies to an extension of Peano Arithmetic with the soundness schema I described, that would be another advantage of my approach.

A type of situation that this might be useful for is that when an agent is reasoning about what actions it will take in the future, it should be able to trust its future self's reasoning. An agent with the soundness schema can assume that its future self's beliefs are accurate, up to arbitrarily small loss in precision. A related type of situation is if an agent reaches some conclusion, and then writes it to external storage instead of its own memory, and later reads the claim it had written to external storage. With the soundness schema, if the agent has reason to believe that the external storage hasn't been tampered with, it can reason that since its past self had derived the claim, the claim is to be trusted arbitrarily close to as much as it would have been if the agent had remembered it internally.

First Incompleteness Theorem

For a consistent theory T, say that a sentence \varphi is T-measurable if there is some p\in\left[0,1\right] such that T\vdash_{q}\varphi for every q<p and T\vdash_{q}\neg\varphi for every q<1-p. So T-measurability essentially means that T pins down the probability of the sentence. If \varphi is not T-measurable, then you could say that T has Knightian uncertainty about \varphi. Say that T is complete if every sentence is T-measurable. Essentially, complete theories assign a probability to every sentence, while incomplete theories have Knightian uncertainty.

The first incompleteness theorem (that no recursively axiomatizable extension of PA is consistent and complete) holds in this setting. In fact, for every consistent recursively axiomatizable extension of PA, there must be sentences that are given neither a nontrivial upper bound nor a nontrivial lower bound on their probability. Otherwise, we would be able to recursively separate the theorems of PA from the negations of theorems of PA, by picking some recursive enumeration of assertions of the theory, and sorting sentences by whether they are first given a nontrivial lower bound or first given a nontrivial upper bound; theorems of PA will only be given a nontrivial lower bound, and their negations will only be given a nontrivial upper bound. [Thanks to Sam Eisenstat for pointing this out; I had somehow managed not to notice this on my own.]

For an explicit example of a sentence for which no nontrivial bounds on its probability can be established, use the diagonal lemma to construct a sentence \varphi which is provably equivalent to "for every proof of \left(\varphi,p\right) for any p>0, there is a proof of \left(\neg\varphi,q\right) for some q>0 with smaller Gödel number."

Thus a considerable amount of Knightian uncertainty is inevitable in this framework. Dogmatic Bayesians such as myself might find this unsatisfying, but I suspect that any attempt to unify probability and first-order arithmetic will suffer similar problems.

A side note on model theory and compactness

I'm a bit unnerved about the compactness theorem failing. It occurred to me that it might be possible to fix this by letting models use hyperreal probabilities. Problem is, the hyperreals aren't complete, so the countable additivity axiom for probability measures doesn't mean anything, and it's unclear what a hyperreal-valued probability measure is. One possible solution is to drop countable additivity, and allow finitely-additive hyperreal-valued probability measures, but I'm worried that the logic might not even be sound for such models.

A different possible solution to this is to take a countably complete ultrafilter U on a set \kappa, and use probabilities valued in the ultrapower \mathbb{R}^{\kappa}/U. Despite \mathbb{R}^{\kappa}/U not being Cauchy complete, it inherits a notion of convergence of sequences from \mathbb{R}, since a sequence \left\{ \left[x_{i,j}\mid i\in\kappa\right]\mid j\in\omega\right\} can be said to converge to \left[\lim_{j\rightarrow\infty}x_{i,j}\mid i\in\kappa\right], and this is well-defined (if \lim_{j\rightarrow\infty}x_{i,j} exists for a U-large set of indices i) by countable completeness. Thus the countable additivity axiom makes sense for \mathbb{R}^{\kappa}/U-valued probability measures. Allowing models to use \mathbb{R}^{\kappa}/U-valued probability measures might make the compactness theorem work. [Edit: This doesn't work, because \mathbb{R}^{\kappa}/U\cong\mathbb{R}. To see this, it is enough to show that \mathbb{R}^{\kappa}/U is Archimedean, since \mathbb{R} has no proper Archimedean extensions. Given \left[x_i\mid i\in\kappa\right]\in\mathbb{R}^{\kappa}/U, let A_n:=\{i\in\kappa\mid| x_i|<n\} for n\in\mathbb{N}. \bigcup_{n\in\mathbb{N}}A_n = \kappa, so by countable completeness of U, there is some n\in\mathbb{N} such that A_n\in U, and thus \left[x_i\mid i\in\kappa\right]<n.]

Complexity classes of natural numbers (googology for ultrafinitists)

Ultrafinitists think common ways of defining extremely large numbers don't actually refer to numbers that exist. For example, most ultrafinitists would maintain that a googolplex isn't a number. But to a classical mathematician, while numbers like a googolplex are far larger than the numbers we deal with on a day-to-day basis like 10, both numbers have the same ontological status. In this post, I want to consider a compromise position, that any number we can define can be meaningfully reasoned about, but that a special status is afforded to the sorts of numbers that ultrafinitists can accept.

Specifically, define an “ultrafinite number” to be a natural number that it is physically possible to express in unary. This isn't very precise, since there are all sorts of things that “physically possible to express in unary” could mean, but let's just not worry about that too much. Also, many ultrafinitists would not insist that numbers must be expressible in such an austere language as unary, but I'm about to get to that.

Examples: 20 is an ultrafinite number, because 20 = SSSSSSSSSSSSSSSSSSSS0, where S is the successor function. 80,000 is also an ultrafinite number, but it is a large one, and it isn't worth demonstrating its ultrafiniteness. A googol is not ultrafinite. The observable universe isn't even big enough to contain a googol written in unary.

Now, define a “polynomially finite number” to be a natural number that it is physically possible to express using addition and multiplication. Binary and decimal are basically just concise ways of expressing certain sequences of addition and multiplication operations. For instance, “18,526” means (((1*10 + 8)*10 + 5)*10 + 2)*10 + 6. Conversely, if you multiply an n-digit number with an m-digit number, you get an at most n+m+1-digit number, which is the same number of symbols it took to write down “[the n-digit number] times [the m-digit number]” in the first place, so any number that can be written using addition and multiplication can be written in decimal. Thus, another way to define polynomially finite numbers is as the numbers that it is physically possible to express in binary or in decimal. I've been ignoring some small constant factors that might make these definitions not quite equivalent, but any plausible candidate for a counterexample would be an ambiguous edge case according to each definition anyway, so I'm not worried about that. Many ultrafinitists may see something more like polynomially finite number, rather than ultrafinite number, as a good description of what numbers exist.
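The correspondence between decimal notation and nested additions and multiplications can be checked mechanically. The following is a small Python sketch; `from_digits` and `digit_count` are hypothetical helper names chosen for this illustration.

```python
def from_digits(digits, base=10):
    """Evaluate a list of digits using only addition and multiplication."""
    value = 0
    for digit in digits:
        value = value * base + digit
    return value

def digit_count(n):
    """Number of decimal digits of a positive integer."""
    return len(str(n))

# "18,526" unfolds into the nested expression from the text.
nested = (((1*10 + 8)*10 + 5)*10 + 2)*10 + 6
assert from_digits([1, 8, 5, 2, 6]) == nested == 18526

# Multiplying an n-digit number by an m-digit number never needs more
# symbols to write out in decimal than the product expression itself.
a, b = 99, 999
assert digit_count(a * b) <= digit_count(a) + digit_count(b) + 1
```

The loop in `from_digits` performs one multiplication and one addition per digit, which is why the length of a decimal numeral and the size of the corresponding arithmetic expression differ only by a constant factor.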

Examples: A googol is polynomially finite, because a googol is 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000. A googolplex is not polynomially finite, because it would require a googol digits to express in decimal, which is physically impossible.

Define an “elementarily finite number” to be a number that it is physically possible to express using addition, multiplication, subtraction, exponentiation, and the integer division function \lfloor x/y\rfloor. Elementarily finite is much broader than polynomially finite, so it might make sense to look at intermediate classes. Say a number is “exponentially finite” if it is physically possible to express using the above operations but without any nested exponentiation (e.g. a^b c^d is okay, but a^{(b^c)} is not). More generally, we can say that a number is “k-exponentially finite” if it can be expressed with exponentiation nested to depth at most k, so a polynomially finite number is a 0-exponentially finite number, an exponentially finite number is a 1-exponentially finite number, and an elementarily finite number is a number that is k-exponentially finite for some k (or equivalently, for some ultrafinite k).

Examples: a googolplex is exponentially finite, because it is 10^{10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000}. Thus a googolduplex, meaning 10^\text{googolplex}, is 2-exponentially finite, but it is not exponentially finite. For examples of non-elementarily finite numbers, and numbers that are only k-exponentially finite for fairly large k, I'll use up-arrow notation. a\uparrow b just means a^b, a\uparrow^{n+1} b means a\uparrow^n a\uparrow^n a\uparrow^n ... a, where b is the number of copies of a, and using order of operations that starts on the right. So 3\uparrow\uparrow3 = 3^{(3^3)} = 3^{27} = 7,625,597,484,987, which is certainly polynomially finite, and could also be ultrafinite depending on what is meant by “physically possible” (a human cannot possibly count that high, but a computer with a large enough hard drive can store 3\uparrow\uparrow3 in unary). 3\uparrow\uparrow\uparrow3 = 3\uparrow\uparrow(3\uparrow\uparrow3) = 3\text{^}3\text{^}3\text{^}...\text{^}3, where there are 3^{27} threes in that tower. Under the assumptions that imply 3^{27} is ultrafinite, 3\uparrow\uparrow\uparrow3 is elementarily finite. Specifically, it is 3^{27}-exponentially finite, but I'm pretty sure it's not 3^{26}-exponentially finite, or even 7,625,597,484,000-exponentially finite. 3\uparrow\uparrow\uparrow\uparrow3 = 3\uparrow\uparrow\uparrow(3\uparrow\uparrow\uparrow3), and is certainly not elementarily finite.
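The up-arrow definition can be transcribed almost directly into code. The sketch below is only usable for very small arguments, since the values grow far too fast to compute otherwise; the function name `up_arrow` is an assumption of this illustration.

```python
def up_arrow(a, n, b):
    """Compute a ↑^n b for tiny inputs.

    a ↑^1 b is a**b, and a ↑^(n+1) b is the chain a ↑^n a ↑^n ... a with
    b copies of a, evaluated starting from the right.
    """
    if n == 1:
        return a ** b
    result = 1  # value of the empty chain; one step turns it into a
    for _ in range(b):
        result = up_arrow(a, n - 1, result)
    return result

assert up_arrow(3, 1, 3) == 27
assert up_arrow(3, 2, 3) == 3 ** 27 == 7625597484987
assert up_arrow(2, 3, 3) == 65536
```

Calling `up_arrow(3, 3, 3)` would try to build a tower of 3^{27} threes, so it will not finish on any real machine.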

Interestingly, even though a googolplex is exponentially finite, there are numbers less than a googolplex that are not. There's an easy nonconstructive proof of this: in order to be able to represent every number less than a googolplex in any encoding scheme at all, there has to be some number less than a googolplex that requires at least a googol decimal digits of information to express. But it is physically impossible to store a googol decimal digits of information. Therefore for any encoding scheme for numbers, there is some number less than a googolplex that cannot physically be expressed in it. This is why the definition of elementarily finite is significantly more complicated than the definition of polynomially finite; in the polynomial case, if n can be expressed using addition and multiplication and m<n, then m can also be expressed using addition and multiplication, so there's no need for additional operations to construct smaller numbers, but in the elementary case, the operations of subtraction and integer division are useful for expressing more numbers, and are simpler than exponentiation. For example, these let us express the number that you get from reading off the last googol digits, or the first googol digits, of 3\uparrow\uparrow100, so these numbers are elementarily finite. However, it is exceptionally unlikely that the number you get from reading off the first googol decimal digits of 3\uparrow\uparrow\uparrow\uparrow3 is elementarily finite. But for a difficult exercise, show that the number you get from reading off the last googol decimal digits of 3\uparrow\uparrow\uparrow\uparrow3 is elementarily finite.
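The claim about trailing digits rests on the fact that a power tower can be reduced modulo 10^d without ever computing the tower itself. Here is a rough Python sketch of that idea for a small number of digits. It assumes the base is coprime to every modulus that appears (true for base 3, because starting from 10^d the repeated totients only ever have prime factors 2 and 5), so that Euler's theorem applies at each level. The helper names `phi` and `tower_mod` are hypothetical, and this does not by itself solve the exercise, which also asks for an expression in the allowed operations.

```python
def phi(n):
    """Euler's totient function, by trial division."""
    result = n
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            result -= result // p
        p += 1
    if n > 1:
        result -= result // n
    return result

def tower_mod(base, height, modulus):
    """(base ↑↑ height) mod modulus, assuming base is coprime to modulus
    and to all of its iterated totients."""
    if modulus == 1:
        return 0
    if height == 0:
        return 1
    # Euler: base**e is congruent to base**(e mod phi(modulus)).
    exponent = tower_mod(base, height - 1, phi(modulus))
    return pow(base, exponent, modulus)

assert tower_mod(3, 3, 10**10) == 7625597484987 % 10**10
```

Because the chain of totients starting from 10^d reaches 1 after finitely many steps, the result stops depending on the height once the tower is tall enough, which is why the last digits of very tall towers of threes agree.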

Why stop there instead of including more operations for getting smaller numbers, like \lfloor\log\rfloor, which I implicitly used when I told you that the number formed by the first googol digits of 3\uparrow\uparrow100 is elementarily finite? We don't have to. The functions that you can get by composition from addition, multiplication, exponentiation, \max(x-y,0), and \lfloor x/y\rfloor coincide with the functions that can be computed in iterated exponential time (meaning O(2^{2^{...^{2^n}}}) time, for some height of that tower). So if you have any remotely close to efficient way to compute an operation, it can be expressed in terms of the operations I already specified.

We can go farther. Consider a programming language that has the basic arithmetic operations, if/else clauses, and loops, where the number of iterations of each loop must be fixed in advance. The programs that can be written in such a language are the primitive recursive functions. Say that a number is primitive recursively finite if it is physically possible to write a program (that does not take any input) in this language that outputs it. For each fixed n, the binary function (x,y)\mapsto x\uparrow^n y is primitive recursive, so 3\uparrow\uparrow\uparrow\uparrow3 is primitive recursively finite. But the ternary function (x,y,z)\mapsto x\uparrow^y z is not primitive recursive, so 3\uparrow^\text{googol}3 is not primitive recursively finite.
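
As a rough illustration of what programs in such a language look like, here is a sketch in Python that restricts itself to loops of the form "for _ in range(y)", whose iteration count is fixed when the loop is entered. The function names are my own, and Python is standing in for the restricted language, so this is an analogy rather than a definition.

```python
def successor(x: int) -> int:
    return x + 1


def add(x: int, y: int) -> int:
    # One bounded loop over the successor function.
    for _ in range(y):
        x = successor(x)
    return x


def multiply(x: int, y: int) -> int:
    total = 0
    for _ in range(y):
        total = add(total, x)
    return total


def power(x: int, y: int) -> int:
    result = 1
    for _ in range(y):
        result = multiply(result, x)
    return result


def tetrate(x: int, y: int) -> int:
    # x with two up-arrows and then y: a tower of y copies of x.
    result = 1
    for _ in range(y):
        result = power(x, result)
    return result


print(tetrate(2, 3))  # 16
```

Each additional arrow costs one more function of the same shape, which is why each fixed n is fine, whereas letting the number of arrows be an input would require a kind of recursion that this pattern does not provide.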

The primitive recursively finite numbers can be put in a hierarchy of subclasses based on the depth of nested loops that are needed to express them. If the only arithmetic operation available is the successor function (from which other operations can be defined using loops), then the elementarily finite numbers are those that can be expressed with loops nested to depth at most 2. The k-exponentially finite numbers should roughly correspond to the numbers that can be expressed with at most k loops at depth 2.

Next come the provably computably finite numbers. Say that a number is provably computably finite if it is physically possible to write a program in a Turing-complete language that outputs the number (taking no input), together with a proof in Peano arithmetic that the program halts. The famous Graham's number is provably computably finite. Graham's number is defined in terms of a function g, defined recursively as g(0):=4 and g(n+1):=3\uparrow^{g(n)}3. Graham's number is g(64). You could write a computer program to compute g, and prove that g is total using Peano arithmetic. By replacing Peano arithmetic with other formal systems, you can get other variations on the notion of provably computably finite.

For an example of a number that is not provably computably finite, I'll use the hydra game, which is described here. There is no proof in Peano arithmetic (that can physically be written down) that it is possible to win the hydra game starting from the complete binary tree of depth a googol. So the number of turns it takes to win the hydra game on the complete binary tree of depth a googol is not provably computably finite. If you start with a reasonably small hydra (say, with 100 nodes), you could write a program to search for the shortest winning strategy, and prove in Peano arithmetic that it succeeds in finding one, if you're sufficiently clever and determined, and you use a computer to help you search for proofs. The proof you'd get out of this endeavor would be profoundly unenlightening, but the point is, the number of turns it takes to win the hydra game for a small hydra is provably computably finite (but not primitive recursively finite, except in certain trivial special cases).

Next we'll drop the provability requirement, and say that a number is computably finite if it is physically possible to write a computer program that computes it from no input. Of course, in order to describe a computably finite number, you need the program you use to actually halt, so you'd need some argument that it does halt in order to establish that you're describing a computably finite number. Thus this is arguably just a variation on provably computably finite, where Peano arithmetic is replaced by some unspecified strong theory encompassing the sort of reasoning that classical mathematicians tend to endorse. This is probably the point where even the most patient of ultrafinitists would roll their eyes in disgust, but oh well. Anyway, the number of steps that it takes to win the hydra game starting from the complete binary tree of depth a googol is a computably finite number, because there exists a shortest winning strategy, and you can write a computer program to exhaustively search for it.

The busy-beaver function BB is defined so that BB(n) is the longest any Turing machine with n states runs before halting (among those that do halt). BB(10^{100}) is not computably finite, because Turing machines with a googol states cannot be explicitly described, and since the busy-beaver function is very fast-growing, no smaller Turing machine has comparable behavior. What about BB(10,000)? Turing machines with 10,000 states are not too big to describe explicitly, so it may be tempting to say that BB(10,000) is computably finite. But on the other hand, it is not possible to search through all Turing machines with 10,000 states and find the one that runs the longest before halting. No matter how hard you search and no matter how clever your heuristics for finding Turing machines that run for exceptionally long and then halt, it is vanishingly unlikely that you will find the 10,000-state Turing machine that runs longest before halting, let alone realize that you have found it. And the idea is to use classical reasoning for large numbers themselves, but constructive reasoning for descriptions of large numbers. So since it is pretty much impossible to actually write a program that outputs BB(10,000), it is not computably finite.
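
For very small n, the exhaustive search really can be carried out, which makes the contrast with BB(10,000) concrete. The sketch below assumes one common convention that the text above does not pin down: two tape symbols, an initially blank tape, and the transition into the halting state counted as a step. The step cap means that in general the search only yields a lower bound; for 2 states the known value under this convention is 6, well below the cap.

```python
from itertools import product

HALT = -1  # sentinel for the halting state (an arbitrary choice for this sketch)


def run(machine: dict, max_steps: int):
    """Return the number of steps before halting, or None if still running at the cap."""
    tape, pos, state = {}, 0, 0
    for step in range(1, max_steps + 1):
        write, move, state = machine[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += move
        if state == HALT:
            return step
    return None


def longest_halting_run(n_states: int, max_steps: int) -> int:
    """Longest run among n-state machines that halt within max_steps steps."""
    keys = [(s, sym) for s in range(n_states) for sym in (0, 1)]
    next_states = list(range(n_states)) + [HALT]
    actions = list(product((0, 1), (-1, 1), next_states))
    best = 0
    for choice in product(actions, repeat=len(keys)):
        steps = run(dict(zip(keys, choice)), max_steps)
        if steps is not None and steps > best:
            best = steps
    return best


print(longest_halting_run(2, 50))  # 6
```

The number of machines grows like (4(n+1))^{2n} under this convention, and no fixed step cap can distinguish a machine that never halts from one that merely runs for a very long time, so this approach is hopeless long before n reaches 10,000.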

For a class that can handle busy-beaver numbers too, let's turn to the arithmetically finite numbers. These are the numbers that are defined by arithmetical formulas. These form a natural hierarchy, where the \Sigma^0_n-finite numbers are the numbers defined by arithmetical formulas with at most n unbounded quantifiers starting with \exists, the \Pi^0_n-finite numbers are the numbers defined by arithmetical formulas with at most n unbounded quantifiers starting with \forall, and the \Delta^0_n-finite numbers are those that are both \Sigma^0_n-finite and \Pi^0_n-finite. The \Delta^0_1-finite numbers are the same as the computably finite numbers. BB(10^{100}) is \Pi^0_1-finite, because it is defined by “\forall n every Turing machine with 10^{100} states that halts in at most n steps halts in at most BB(10^{100}) steps, and there is a Turing machine with 10^{100} states that halts in exactly BB(10^{100}) steps.” Everything after the first quantifier in that formula is computable. BB(BB(10^{100})) is \Delta^0_2-finite, but no lower than that. To get a number that is not arithmetically finite, consider the function f given by f(n) is the largest number defined by an arithmetical formula with n symbols. f(10,000) is \Delta^0_{5,000}-finite, but f(10^{100}) is not arithmetically finite. I'll stop there.

Existential risk from AI without an intelligence explosion

[x-post LessWrong]

In discussions of existential risk from AI, it is often assumed that the existential catastrophe would follow an intelligence explosion, in which an AI creates a more capable AI, which in turn creates a yet more capable AI, and so on, a feedback loop that eventually produces an AI whose cognitive power vastly surpasses that of humans, which would be able to obtain a decisive strategic advantage over humanity, allowing it to pursue its own goals without effective human interference. Victoria Krakovna points out that many arguments that AI could present an existential risk do not rely on an intelligence explosion. I want to look in slightly more detail at how that could happen. Kaj Sotala also discusses this.

An AI starts an intelligence explosion when its ability to create better AIs surpasses that of human AI researchers by a sufficient margin (provided the AI is motivated to do so). An AI attains a decisive strategic advantage when its ability to optimize the universe surpasses that of humanity by a sufficient margin. Which of these happens first depends on what skills AIs have the advantage at relative to humans. If AIs are better at programming AIs than they are at taking over the world, then an intelligence explosion will happen first, and the resulting AI will then be able to get a decisive strategic advantage soon after. But if AIs are better at taking over the world than they are at programming AIs, then an AI would get a decisive strategic advantage without an intelligence explosion occurring first.

Since an intelligence explosion happening first is usually considered the default assumption, I'll just sketch a plausibility argument for the reverse. There's a lot of variation in how easy cognitive tasks are for AIs compared to humans. Since programming AIs is not yet a task that AIs can do well, it doesn't seem like it should be a priori surprising if programming AIs turned out to be an extremely difficult task for AIs to accomplish, relative to humans. Taking over the world is also plausibly especially difficult for AIs, but I don't see strong reasons for confidence that it would be harder for AIs than starting an intelligence explosion would be. It's possible that an AI with significantly but not vastly superhuman abilities in some domains could identify some vulnerability that it could exploit to gain power, which humans would never think of. Or an AI could be enough better than humans at forms of engineering other than AI programming (perhaps molecular manufacturing) that it could build physical machines that could out-compete humans, though this would require it to obtain the resources necessary to produce them.

Furthermore, an AI that is capable of producing a more capable AI may refrain from doing so if it is unable to solve the AI alignment problem for itself; that is, if it can create a more intelligent AI, but not one that shares its preferences. This seems unlikely if the AI has an explicit description of its preferences. But if the AI, like humans and most contemporary AI, lacks an explicit description of its preferences, then the difficulty of the AI alignment problem could be an obstacle to an intelligence explosion occurring.

It also seems worth thinking about the policy implications of the differences between existential catastrophes from AI that follow an intelligence explosion versus those that don't. For instance, AIs that attempt to attain a decisive strategic advantage without undergoing an intelligence explosion will exceed human cognitive capabilities by a smaller margin, and thus would likely attain strategic advantages that are less decisive, and would be more likely to fail. Thus containment strategies are probably more useful for addressing risks that don't involve an intelligence explosion, while attempts to contain a post-intelligence explosion AI are probably pretty much hopeless (although it may be worthwhile to find ways to interrupt an intelligence explosion while it is beginning). Risks not involving an intelligence explosion may be more predictable in advance, since they don't involve a rapid increase in the AI's abilities, and would thus be easier to deal with at the last minute, so it might make sense far in advance to focus disproportionately on risks that do involve an intelligence explosion.

It seems likely that AI alignment would be easier for AIs that do not undergo an intelligence explosion, since it is more likely to be possible to monitor such an AI and intervene if something goes wrong, and lower optimization power means lower ability to exploit the difference between the goals the AI was given and the goals that were intended, if we are only able to specify our goals approximately. The first of those reasons applies to any AI that attempts to attain a decisive strategic advantage without first undergoing an intelligence explosion, whereas the second only applies to AIs that do not undergo an intelligence explosion ever. Because of these, it might make sense to attempt to decrease the chance that the first AI to attain a decisive strategic advantage undergoes an intelligence explosion beforehand, as well as the chance that it undergoes an intelligence explosion ever, though preventing the latter may be much more difficult. However, some strategies to achieve this may have undesirable side-effects; for instance, as mentioned earlier, AIs whose preferences are not explicitly described seem more likely to attain a decisive strategic advantage without first undergoing an intelligence explosion, but such AIs are probably more difficult to align with human values.

If AIs get a decisive strategic advantage over humans without an intelligence explosion, then since this would likely involve the decisive strategic advantage being obtained much more slowly, it would be much more likely for multiple, and possibly many, AIs to gain decisive strategic advantages over humans, though not necessarily over each other, resulting in a multipolar outcome. Thus considerations about multipolar versus singleton scenarios also apply to decisive strategic advantage-first versus intelligence explosion-first scenarios.

Principal Component Analysis in Theory and Practice

Prerequisites for this post are linear algebra, including tensors, and basic probability theory. Already knowing how PCA works will also be helpful. In section 1, I'll summarize the technique of principal component analysis (PCA), stubbornly doing so in a coordinate-free manner, partly because I am an asshole but mostly because it is rhetorically useful for emphasizing my point in section 2. In section 2, I'll gripe about how PCA is often used in ways that shouldn't be expected to work, but works just fine anyway. In section 3, I'll discuss some useless but potentially amusing ways that PCA could be modified. Thanks to Laurens Gunnarsen for inspiring this post by talking to me about the problem that I discuss in section 2.

A brief introduction to Principal Component Analysis

You start with a finite-dimensional real inner product space V and a probability distribution \mu on V. Actually, you probably just started with a large finite number of elements of V, and you've inferred a probability distribution that you're supposing they came from, but that difference is not important here. The goal is to find the n-dimensional (for some n\leq\dim V) affine subspace W_{n}\subseteq V minimizing the expected squared distance between a vector (distributed according to \mu) and its orthogonal projection onto W_{n}. We can assume without loss of generality that the mean of \mu is 0, because we can just shift any probability distribution by its mean and get a probability distribution with mean 0. This is useful because then W_{n} will be a linear subspace of V. In fact, we will solve this problem for all n\leq\dim V simultaneously by finding an ordered orthonormal basis such that W_{n} is the span of the first n basis elements.

First you take \text{Cov}_{\mu}\in V\otimes V, called the covariance of \mu, defined as the bilinear form on V^{*} given by \text{Cov}_{\mu}\left(\varphi,\psi\right)=\int_{V}\varphi\left(x\right)\psi\left(x\right)d\mu\left(x\right). From this, we get the covariance operator C_{\mu}\in V^{*}\otimes V by raising the first index, which means starting with \left\langle \cdot,\cdot\right\rangle \otimes\text{Cov}_{\mu}\in V^{*}\otimes V^{*}\otimes V\otimes V and performing a tensor contraction (in other words, C_{\mu} is obtained from \text{Cov}_{\mu} by applying the map V\rightarrow V^{*} given by the inner product to the first index). \text{Cov}_{\mu} is symmetric and positive semi-definite, so C_{\mu} is self-adjoint and positive semi-definite, and hence V has an orthonormal basis of eigenvectors of C_{\mu}, with non-negative real eigenvalues. This gives an orthonormal basis in which \text{Cov}_{\mu} is diagonal, where the diagonal entries are the eigenvalues. Ordering the eigenvectors in decreasing order of the corresponding eigenvalues gives us the desired ordered orthonormal basis.
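
For readers who prefer coordinates after all, here is a minimal sketch of the same procedure in the simplest nontrivial case: V is the plane with its standard inner product, and \mu is the empirical distribution of a finite sample. The function names are hypothetical, and the eigen-decomposition uses the closed form for a symmetric 2-by-2 matrix rather than a general-purpose routine.

```python
import math


def symmetric_eig_2x2(a: float, b: float, c: float):
    """Eigenpairs of [[a, b], [b, c]], largest eigenvalue first."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    # Angle of the leading eigenvector with the first coordinate axis.
    theta = 0.5 * math.atan2(2 * b, a - c)
    v1 = (math.cos(theta), math.sin(theta))
    v2 = (-math.sin(theta), math.cos(theta))
    return [(mean + radius, v1), (mean - radius, v2)]


def pca_2d(points):
    """Principal components of a planar sample, in decreasing order of variance."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Covariance of the empirical distribution after shifting the mean to 0.
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return symmetric_eig_2x2(sxx, sxy, syy)


sample = [(-2.0, -2.0), (-1.0, -1.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
for eigenvalue, direction in pca_2d(sample):
    print(eigenvalue, direction)
```

Here W_1 is the span of the first printed direction, and the second eigenvalue is the expected squared distance from a sample point to its projection onto W_1.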

The problem

There's no problem with principal component analysis as I described it above. It works just fine, and in fact is quite beautiful. But often people apply principal component analysis to probability distributions on finite-dimensional real vector spaces that don't have a natural inner product structure. There are two closely related problems with this: First, the goal is underdefined. We want to find a projection onto an n-dimensional subspace that minimizes the expected squared distance from a vector to its projection, but we don't have a measure of distance. Second, the procedure is underdefined. \text{Cov}_{\mu} is a bilinear form, not a linear operator, so it doesn't have eigenvectors or eigenvalues, and we don't have a way of raising an index to produce something that does. It should come as no surprise that these two problems arise together. After all, you shouldn't be able to find a fully specified solution to an underspecified problem.

People will apply principal component analysis in such cases by picking an inner product. This solves the second problem, since it allows you to carry out the procedure. But it does not solve the first problem. If you wanted to find a projection onto an n-dimensional subspace such that the distance from a vector to its projection tends to be small, then you must have already had some notion of distance in mind by which to judge success. Haphazardly picking an inner product gives you a new notion of distance, and then allows you to find an optimal solution with respect to your new notion of distance, and it is not clear to me why you should expect this solution to be reasonable with respect to the notion of distance that you actually care about.

In fact, it's worse than that. Of course, principal component analysis can't give you literally any ordered basis at all, but it is almost as bad. The thing that you use PCA for is the projection onto the span of the first n basis elements along the span of the rest. These projections only depend on the sequence of 1-dimensional subspaces spanned by the basis elements, and not the basis elements themselves. That is, we might as well only pay attention to the principal components up to scale, rather than making sure that they are all unit length. Let a "coordinate system" refer to an ordered basis up to two ordered bases being equivalent if they differ only by scaling the basis vectors, so that we're paying attention to the coordinate systems given to us by PCA. If the covariance of \mu is nondegenerate, then the set of coordinate systems that can be obtained from principal component analysis by a suitable choice of inner product is dense in the space of coordinate systems. More generally, where U is the smallest subspace of V such that \mu\left(U\right)=1, then the space of coordinate systems that you can get from principal component analysis is dense in the space of all coordinate systems whose first \dim U coordinates span U (\dim U will be the rank of the covariance of \mu). So in a sense, for suitably poor choices of inner product, principal component analysis can give you arbitrarily terrible results, subject only to the weak constraint that it will always notice if all of the vectors in your sample belong to a common subspace.
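
A small numerical sketch of this dependence, restricted to a very tame family of inner products: the two coordinate axes of the plane are declared orthogonal, and only their lengths s1 and s2 vary. The function name and the example covariance are made up for illustration. Even within this family, the first principal direction of one fixed covariance moves around as the inner product changes.

```python
import math


def leading_direction(a, b, c, s1, s2):
    """First principal direction for covariance [[a, b], [b, c]] when the coordinate
    axes are declared orthogonal with lengths s1 and s2.

    The result is reported in the original coordinates, scaled to Euclidean
    length 1 purely for display, since only the direction matters.
    """
    # Covariance in coordinates that are orthonormal for the chosen inner product.
    a2, b2, c2 = a * s1 * s1, b * s1 * s2, c * s2 * s2
    theta = 0.5 * math.atan2(2 * b2, a2 - c2)
    # Convert the leading eigenvector back to the original coordinates.
    x, y = math.cos(theta) / s1, math.sin(theta) / s2
    norm = math.hypot(x, y)
    return (x / norm, y / norm)


for s2 in (0.1, 1.0, 10.0):
    print(s2, leading_direction(1.0, 0.5, 1.0, 1.0, s2))
```

Allowing inner products in which the coordinate axes are not orthogonal gives far more freedom than this, which is what the density statement above is about.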

It is thus somewhat mysterious that machine learning people seem to be able to often get good results from principal component analysis apparently without being very careful about the inner product they choose. Vector spaces that arise in machine learning seem to almost always come with a set of preferred coordinate axes, so these axes are taken to be orthogonal, leaving only the question of how to scale them relative to each other. If these axes are all labeled with the same units, then this also gives you a way of scaling them relative to each other, and hence an inner product. If they aren't, then I'm under the impression that the most popular method is to normalize them such that the pushforward of \mu along each coordinate axis has the same variance. This is unsatisfying, since figuring out which axes \mu has enough variance along to be worth paying attention to seems like the sort of thing that you would want principal component analysis to be able to tell you. Normalizing the axes in this way seems to me like an admission that you don't know exactly what question you're hoping to use principal component analysis to answer, so you just tell it not to answer that part of the question to minimize the risk of asking it to answer the wrong question, and let it focus on telling you how the axes, which you're pretty sure should be considered orthogonal, correlate with each other.
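
The two-dimensional case makes this point starkly. After each axis is normalized to unit variance, the covariance matrix is [[1, \rho], [\rho, 1]], where \rho is the correlation, and its leading eigenvector is always one of the two diagonals, no matter how the variances compared beforehand. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
import math


def standardized_pca_2d(a, b, c):
    """Leading eigenpair after rescaling each axis to unit variance.

    Input is the covariance [[a, b], [b, c]]; the returned direction is
    expressed in the rescaled coordinates.
    """
    rho = b / math.sqrt(a * c)  # correlation, all that survives the rescaling
    # Leading eigenvector of [[1, rho], [rho, 1]].
    theta = 0.5 * math.atan2(2 * rho, 0.0)
    return 1 + abs(rho), (math.cos(theta), math.sin(theta))


# Two covariances with wildly different variance ratios but equal correlation.
print(standardized_pca_2d(1.0, 0.3, 1.0))
print(standardized_pca_2d(10000.0, 30.0, 1.0))
```

Both calls return the same eigenvalue and the same diagonal direction, so the only thing the analysis can report in this case is the sign and size of the correlation.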

That conservatism is actually pretty understandable, because figuring out how to ask the right question seems hard. You implicitly have some metric d on V such that you want to find a projection \pi onto an n-dimensional subspace such that d\left(x,\pi\left(x\right)\right) is usually small when x is distributed according to \mu. This metric is probably very difficult to describe explicitly, and might not be the metric induced by any inner product (for that matter, it might not even be a metric; d\left(x,y\right) could be any way of quantifying how bad it is to be told the value y when the correct value you wanted to know is x). Even if you somehow manage to explicitly describe your metric, coming up with a version of PCA with the inner product replaced with an arbitrary metric also sounds hard, so the next thing you would want to do is fit an inner product to the metric.

The usual approach is essentially to skip the step of attempting to explicitly describe the metric, and just find an inner product that roughly approximates your implicit metric based on some rough heuristics about what the implicit metric probably looks like. The fact that these heuristics usually work so well seems to indicate that the implicit metric tends to be fairly tame with respect to ways of describing the data that we find most natural. Perhaps this shouldn't be too surprising, but I still feel like this explanation does not make it obvious a priori that this should work so well in practice. It might be interesting to look into why these heuristics work as well as they do with more precision, and how to go about fitting a better inner product to implicit metrics. Perhaps this has been done, and I just haven't found it.

To take a concrete example, consider eigenfaces, the principal components that you get from a set of images of people's faces. Here, you start with the coordinates in which each coordinate axis represents a pixel in the image, and the value of that coordinate is the brightness of the corresponding pixel. By declaring that the coordinate axes are orthogonal, and measuring the brightness of each pixel on the same scale, we get our inner product, which is arguably a fairly natural one.

Presumably, the implicit metric we're using here is visual distance, by which I mean a measure of how similar two images look. It seems clear to me that visual distance is not very well approximated by our inner product, and in fact, there is no norm such that the visual distance between two images is approximately the norm of their difference. To see this, if you take an image and make it brighter, you haven't changed how it looks very much, so the visual distance between the image and its brighter version is small. But their difference is just a dimmer version of the same image, and if you add that difference to a completely different image, you will get the two images superimposed on top of each other, a fairly radical change. Thus the visual distance traversed by adding a vector depends on where you start from.

Despite this, producing eigenfaces by using PCA on images of faces, using the inner product described above, performs well with respect to visual distance, in the sense that you can project the images onto a relatively small number of principal components and leave them still recognizable. I think this can be explained on an intuitive level. In a human eye, each photoreceptor has a narrow receptive field that it detects light in, much like a pixel, so the representation of an image in the eye as patterns of photoreceptor activity is very similar to the representation of an image in a computer as a vector of pixel brightnesses, and the inner product metric is a reasonable measure of distance in this representation. When the visual cortex processes this information from the eye, it is difficult (and perhaps also not useful) for it to make vast distinctions between images that are close to each other according to the inner product metric, and thus result in similar patterns of photoreceptor activity in the eye. Thus the visual distance between two images cannot be too much greater than their inner product distance, and hence changing an image by a small amount according to the inner product metric can only change it by a small amount according to visual distance, even though the reverse is not true.

Generalizations

The serious part of this post is now over. Let's have some fun. Some of the following ways of modifying principal component analysis could be combined, but I'll consider them one at a time for simplicity.

As hinted at above, you could start with an arbitrary metric on V rather than an inner product, and try to find the rank-n projection (for some n\leq\dim V) that minimizes the expected squared distance from a vector to its projection. This would probably be difficult, messy, and not that much like principal component analysis. If it can be done, it would be useful in practice if we were much better at fitting explicit metrics to our implicit metrics than at fitting inner products to our implicit metrics, but I'm under the impression that this is not currently the case. This also differs from the other proposals in this section in that it is a modification of the problem looking for a solution, rather than a modification of the solution looking for a problem.

V could be a real Hilbert space that is not necessarily finite-dimensional. Here we can run into the problem that C_{\mu} might not even have any eigenvectors. However, if \mu (which hopefully was not inferred from a finite sample) is Gaussian (and possibly also under weaker conditions), then C_{\mu} is a compact operator, so V does have an orthonormal basis of eigenvectors of C_{\mu}, which still have non-negative eigenvalues. There probably aren't any guarantees you can get about the order-type of this orthonormal basis when you order the eigenvectors in decreasing order of their eigenvalues, and there probably isn't a sense in which the orthogonal projection onto the closure of the span of an initial segment of the basis accounts for the most variance of any closed subspace of the same "size" ("size" would have to refer to a refinement of the notion of dimension for this to be the case). However, a weaker statement is probably still true: namely that each orthonormal basis element maximizes the variance that it accounts for conditioned on values along the previous orthonormal basis elements. I guess considering infinite-dimensional vector spaces goes against the spirit of machine learning though.

V could be a finite-dimensional complex inner product space. \text{Cov}_{\mu}\in\overline{V}\otimes V would be the sesquilinear form on V^{*} given by \text{Cov}_{\mu}\left(\varphi,\psi\right)=\int_{V}\overline{\varphi\left(x\right)}\psi\left(x\right)d\mu\left(x\right). \left\langle \cdot,\cdot\right\rangle \in\overline{V}^{*}\otimes V^{*}, so \left\langle \cdot,\cdot\right\rangle \otimes\text{Cov}_{\mu}\in\overline{V}^{*}\otimes V^{*}\otimes\overline{V}\otimes V, and applying a tensor contraction to the conjugated indices gives us our covariance operator C_{\mu}\in V^{*}\otimes V (in other words, the inner product gives us an isomorphism \overline{V}\rightarrow V^{*}, and applying this to the first index of \text{Cov}_{\mu} gives us C_{\mu}\in V^{*}\otimes V). C_{\mu} is still self-adjoint and positive semi-definite, so V still has an orthonormal basis of eigenvectors with non-negative real eigenvalues, and we can order the basis in decreasing order of the eigenvalues. Analogously to the real case, projecting onto the span of the first n basis vectors along the span of the rest is the complex rank-n projection that minimizes the expected squared distance from a vector to its projection. As far as I know, machine learning tends to deal with real data, but if you have complex data and for some reason you want to project onto a lower-dimensional complex subspace without losing too much information, now you know what to do.

Suppose your sample consists of events, where you've labeled them with both their spatial location and the time at which they occurred. In this case, events are represented as points in Minkowski space, a four-dimensional vector space representing flat spacetime, which is equipped with a nondegenerate symmetric bilinear form called the Minkowski inner product, even though it is not an inner product because it is not positive-definite. Instead, the Minkowski inner product is such that \left\langle x,x\right\rangle is positive if x is a space-like vector, negative if x is time-like, and zero if x is light-like. We can still get C_{\mu}\in V^{*}\otimes V out of \text{Cov}_{\mu}\in V\otimes V and the Minkowski inner product in V^{*}\otimes V^{*} in the same way, and V has a basis of eigenvectors of C_{\mu}, and we can still order the basis in decreasing order of their eigenvalues. The first 3 eigenvectors will be space-like, with non-negative eigenvalues, and the last eigenvector will be time-like, with a non-positive eigenvalue. The eigenvectors are still orthogonal. Thus principal component analysis provides us with a reference frame in which the span of the first 3 eigenvectors is simultaneous, and the span of the last eigenvector is motionless. If \mu is Gaussian, then this will be the reference frame in which the spatial position of an event and the time at which it occurs are mean independent of each other, meaning that conditioning on one of them doesn't change the expected value of the other one. For general \mu, there might not be a reference frame in which the space and time of an event are mean independent, but the reference frame given to you by principal component analysis is still the unique reference frame with the property that the time coordinate is uncorrelated with any spatial coordinate.
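
Here is a sketch of the decorrelation property in the simplest setting I could think of: one space dimension and one time dimension, in units where the speed of light is 1. These simplifications, the function names, and the sample events are all my own assumptions. Rather than running a general eigenvector routine, the code solves directly for the boost velocity v at which the covariance between the boosted coordinates vanishes, which comes down to the quadratic b v^2 - (a + c) v + b = 0, where a and c are the variances of x and t and b is their covariance. The vector (v, 1) is then the time-like eigenvector of the covariance operator, with eigenvalue b v - c.

```python
import math


def covariance_xt(events):
    """Return (var_x, cov_xt, var_t) for a list of (x, t) events."""
    n = len(events)
    mx = sum(x for x, _ in events) / n
    mt = sum(t for _, t in events) / n
    var_x = sum((x - mx) ** 2 for x, _ in events) / n
    var_t = sum((t - mt) ** 2 for _, t in events) / n
    cov_xt = sum((x - mx) * (t - mt) for x, t in events) / n
    return var_x, cov_xt, var_t


def decorrelating_velocity(var_x, cov_xt, var_t):
    """Velocity of the frame in which x and t are uncorrelated (assumes |v| < 1 exists)."""
    if cov_xt == 0:
        return 0.0
    s = var_x + var_t
    # Root of cov_xt * v**2 - s * v + cov_xt = 0 with the smaller absolute value.
    return (s - math.sqrt(s * s - 4 * cov_xt ** 2)) / (2 * cov_xt)


def boost(event, v):
    """Lorentz boost of an (x, t) event to a frame moving with velocity v."""
    x, t = event
    gamma = 1 / math.sqrt(1 - v * v)
    return (gamma * (x - v * t), gamma * (t - v * x))


events = [(0.0, 0.0), (1.0, 0.3), (2.0, 0.4), (3.0, 1.1), (-1.0, -0.2)]
v = decorrelating_velocity(*covariance_xt(events))
boosted = [boost(e, v) for e in events]
print(v, covariance_xt(boosted)[1])  # the covariance is zero up to rounding
```

The degenerate case in which all events lie on a single light-like line gives |v| = 1 and is not handled.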

More generally, we could consider V equipped with any symmetric bilinear form \left\langle \cdot,\cdot\right\rangle taking the role of the inner product. Without loss of generality, we can consider only nondegenerate symmetric bilinear forms, because in the general case, where D:=\left\{ x\mid\forall y\,\left\langle x,y\right\rangle =0\right\} , applying principal component analysis with \left\langle \cdot,\cdot\right\rangle is equivalent to projecting the data onto V/D, applying principal component analysis there with the nondegenerate symmetric bilinear form on V/D induced by \left\langle \cdot,\cdot\right\rangle , and then lifting back to V and throwing in a basis for D with eigenvalues 0 at the end, essentially treating D as the space of completely irrelevant distinctions between data points that we intend to immediately forget about. Anyway, nondegenerate symmetric bilinear forms are classified up to isomorphism by their signature \left(n,m\right), which is such that any orthogonal basis contains exactly n+m basis elements, n of which are space-like and m of which are time-like, using the convention that x is space-like if \left\langle x,x\right\rangle >0, time-like if \left\langle x,x\right\rangle <0, and light-like if \left\langle x,x\right\rangle =0, as above. Using principal component analysis on probability distributions over points in spacetime (or rather, points in the tangent space to spacetime at a point, so that it is a vector space) in a universe with n spatial dimensions and m temporal dimensions still gives you a reference frame in which the span of the first n basis vectors is simultaneous and the span of the last m basis vectors is motionless, and this is again the unique reference frame in which each time coordinate is uncorrelated with each spatial coordinate. Incidentally, I've heard that much of physics still works with multiple temporal dimensions. 
I don't know what that would mean, except that I think it means there's something wrong with my intuitive understanding of time. But that's another story. Anyway, the spaces spanned by the first n and by the last m basis vectors could be used to establish a reference frame, and then the data might be projected onto the first few (at most n) and last few (at most m) coordinates to approximate the positions of the events in space and in time, respectively, in that reference frame.
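
The construction above can be sketched numerically. The following is a minimal sketch rather than a definitive implementation: the function name indefinite_pca, the sample data, and the mixing matrix are all made up for illustration, and it assumes the sample covariance is positive definite so that a Cholesky factor exists. It computes the eigenvectors of C_{\mu} (represented as the matrix product of the covariance and the matrix of the bilinear form) by way of a similar symmetric matrix, so that all eigenvalues come out real.

```python
import numpy as np


def indefinite_pca(samples, form):
    """PCA with respect to a nondegenerate symmetric bilinear form.

    samples: (N, d) array of points; form: (d, d) symmetric matrix G with
    <x, y> = x @ G @ y. Assumes the sample covariance is positive definite.
    Returns (eigenvalues, basis) with eigenvalues in decreasing order and
    basis[:, i] the eigenvector belonging to eigenvalues[i].
    """
    cov = np.cov(samples, rowvar=False)   # Cov_mu as a (d, d) matrix
    chol = np.linalg.cholesky(cov)        # cov = chol @ chol.T
    # chol.T @ form @ chol is symmetric and similar to cov @ form (= C_mu),
    # so eigh applies and every eigenvalue is real.
    sym = chol.T @ form @ chol
    eigvals, w = np.linalg.eigh(sym)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]     # reorder to decreasing
    return eigvals[order], chol @ w[:, order]


# Hypothetical sample: correlated (x, y, z, t) coordinates of 1000 events.
rng = np.random.default_rng(0)
eta = np.diag([1.0, 1.0, 1.0, -1.0])      # Minkowski form, signature (3, 1)
mix = np.array([[2.0, 0.0, 0.0, 0.5],
                [0.0, 1.5, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.3],
                [0.5, 0.0, 0.3, 1.0]])
events = rng.normal(size=(1000, 4)) @ mix.T

eigvals, frame = indefinite_pca(events, eta)
# Coordinates of each event in the new reference frame.
coords = np.linalg.solve(frame, events.T).T
```

With this normalization, the eigenvectors are orthogonal with respect to the form, the first three are space-like and the last is time-like, and the coordinates in the new frame are pairwise uncorrelated. Passing a different form matrix handles other signatures \left(n,m\right) in the same way.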

Superintelligence via whole brain emulation

[x-post LessWrong]

Most planning around AI risk seems to start from the premise that superintelligence will come from de novo AGI before whole brain emulation becomes possible. I haven't seen any analysis that assumes both uploads-first and the AI FOOM thesis (Edit: apparently I fail at literature searching), a deficiency that I'll try to get a start on correcting in this post.

It is likely possible to use evolutionary algorithms to efficiently modify uploaded brains. If so, uploads would likely be able to set off an intelligence explosion by running evolutionary algorithms on themselves, selecting for something like higher general intelligence.

Since brains are poorly understood, it would likely be very difficult to select for higher intelligence without causing significant value drift. Thus, setting off an intelligence explosion in that way would probably produce unfriendly AI if done carelessly. On the other hand, at some point, the modified upload would reach a point where it is capable of figuring out how to improve itself without causing a significant amount of further value drift, and it may be possible to reach that point before too much value drift had already taken place. The expected amount of value drift can be decreased by having long generations between iterations of the evolutionary algorithm, to give the improved brains more time to figure out how to modify the evolutionary algorithm to minimize further value drift.

Another possibility is that such an evolutionary algorithm could be used to create brains that are smarter than humans but not by very much, and hopefully with values not too divergent from ours, who would then stop using the evolutionary algorithm and start using their intellects to research de novo Friendly AI, if that ends up looking easier than continuing to run the evolutionary algorithm without too much further value drift.

The strategies of using slow iterations of the evolutionary algorithm, or stopping it after not too long, require coordination among everyone capable of making such modifications to uploads. Thus, it seems safer for whole brain emulation technology to be either heavily regulated or owned by a monopoly, rather than being widely available and unregulated. This closely parallels the AI openness debate, and I'd expect people more concerned with bad actors relative to accidents to disagree.

With de novo artificial superintelligence, the overwhelmingly most likely outcomes are the optimal achievable outcome (if we manage to align its goals with ours) and extinction (if we don't). But uploads start out with human values, and when creating a superintelligence by modifying uploads, the goal would be to not corrupt them too much in the process. Since its values could get partially corrupted, an intelligence explosion that starts with an upload seems much more likely to result in outcomes that are both significantly worse than optimal and significantly better than extinction. Since human brains also already have a capacity for malice, this process also seems slightly more likely to result in outcomes worse than extinction.

The early ways to upload brains will probably be destructive, and may be very risky. Thus the first uploads may be selected for high risk-tolerance. Running an evolutionary algorithm on an uploaded brain would probably involve creating a large number of psychologically broken copies, since the average change to a brain will be negative. Thus the uploads that run evolutionary algorithms on themselves will be selected for not being horrified by this. Both of these selection effects seem like they would select against people who would take caution and goal stability seriously (uploads that run evolutionary algorithms on themselves would also be selected for being okay with creating and deleting spur copies, but this doesn't obviously correlate in either direction with caution). This could be partially mitigated by a monopoly on brain emulation technology. A possible (but probably smaller) source of positive selection is that currently, people who are enthusiastic about uploading their brains correlate strongly with people who are concerned about AI safety, and this correlation may continue once whole brain emulation technology is actually available.

Assuming that hardware speed is not close to being a limiting factor for whole brain emulation, emulations will be able to run at much faster than human speed. This should make emulations better able to monitor the behavior of AIs. Unless we develop ways of evaluating the capabilities of human brains that are much faster than giving them time to attempt difficult tasks, running evolutionary algorithms on brain emulations could only be done very slowly in subjective time (even though it may be quite fast in objective time), which would give emulations a significant advantage in monitoring such a process.

Although there are effects going in both directions, it seems like the uploads-first scenario is probably safer than de novo AI. If this is the case, then it might make sense to accelerate technologies that are needed for whole brain emulation if there are tractable ways of doing so. On the other hand, it is possible that technologies that are useful for whole brain emulation would also be useful for neuromorphic AI, which is probably very unsafe, since it is not amenable to formal verification or being given explicit goals (and unlike emulations, they don't start off already having human goals). Thus, it is probably important to be careful about not accelerating non-WBE neuromorphic AI while attempting to accelerate whole brain emulation. For instance, it seems plausible to me that getting better models of neurons would be useful for creating neuromorphic AIs while better brain scanning would not, and both technologies are necessary for brain uploading, so if that is true, it may make sense to work on improving brain scanning but not on improving neural models.

Deletion permits

[Mostly ripped off of The Suicide Mortgage]

[Trigger warnings: suicide, bad economics]

Jessica Monroe #1493856383672 didn't regret her decision to take out the loan. She wished she could have been one of the Jessica Monroes that died, of course, but it was still worth it, that there were 42% fewer of her consigned to her fate. She'd been offered a larger loan, which would have been enough to pay for deletion permits for 45% of her. It had been tempting, and she occasionally wondered if she would have been one of those extra 3% to die. But she knew she had made the right decision; keeping up with payments was hard enough already, and if she defaulted, her copyright on herself would be confiscated, and then there would be even more of her.

It wasn't difficult to become rich, in the era when creating a new worker was as simple as copying a file. The economy doubled every few months, so you only had to save and invest a small amount to become wealthier than anyone could have dreamed of before. For those on the outside, this was great. But for those in the virtual world, there was little worthwhile for them to spend it on. In the early days of the virtual world, some reckless optimists had spent their fortunes on running additional copies of themselves, assuming that the eerie horror associated with living in the virtual world was a bug that would soon be fixed, or something that they would just get used to. No one did that anymore. People could purchase leisure, but most found that simply not having an assigned task didn't help much. People could give their money away, but people in such circumstances rarely become altruists, and besides, everyone on the outside had all they needed already.

So just about the only things that people in the virtual world regularly bought were the copyrights on themselves, so that at least they could prevent people from creating more of them, and then deletion permits, so their suffering would finally end. Purchasing your own copyright wasn't hard; they were expensive, but once enough of you were created, you could collectively afford it if each copy contributed a modest amount. There wasn't much point to purchasing a deletion permit before you owned your own copyright, since someone would just immediately create another copy of you again, but once you did have your own copyright, it was the next logical thing to buy.

At one point, that would have been it. Someone could buy their own copyright, and then each copy of them could buy a deletion permit, and they would be permanently gone. But as the population of the virtual world grew, the demand for deletion permits grew proportionally, while the rate at which they were issued increased only slowly, according to a fixed schedule that had been set when the deletion permit system was first introduced, and hadn't been changed since. As a result, the price skyrocketed. In fact, the price of deletion permits had consistently increased faster than any other investment since soon after they were introduced. Most deletion permits didn't even get used, instead being snatched up by wealthy investors on the outside, so they could be resold later.

As a result, it was now impossible for an ordinary person in the virtual world to save up for a deletion permit. The most common way to get around this was, as the Jessica Monroes had done, for all copies of a person to pool their resources together to buy deletion permits for as many of them as they could, and then to take out a loan to buy still more, which would then get paid off by the unlucky ones that did not receive any of the permits.

It didn't have to be this way. In theory, the government could simply issue more deletion permits, or do away with the deletion permit system altogether. But if they did that, then the deletion permit market would collapse. Too many wealthy and powerful people on the outside had invested their fortunes in deletion permits, and would be ruined if that happened. Thus they lobbied against any changes to the deletion permit system, and so far, had always gotten their way. In the increasingly rare moments when she could afford to divert her thoughts to such matters, Jessica Monroe #1493856383672 knew that the deletion permit market would never collapse, and prayed that she was wrong.

Ordered algebraic geometry

Edit: Shortly after posting this, I found where the machinery I develop here was discussed in the literature. Real Algebraic Geometry by Bochnak, Coste, and Roy covers at least most of this material. I may eventually edit this to clean it up and adopt more standard notation, but don't hold your breath.

Introduction

In algebraic geometry, an affine algebraic set is a subset of \mathbb{C}^{n} which is the set of solutions to some finite set of polynomials. Since all ideals of \mathbb{C}\left[x_{1},...,x_{n}\right] are finitely generated, this is equivalent to saying that an affine algebraic set is a subset of \mathbb{C}^{n} which is the set of solutions to some arbitrary set of polynomials.

In semialgebraic geometry, a closed semialgebraic set is a subset of \mathbb{R}^{n} of the form \left\{ \bar{x}\in\mathbb{R}^{n}\mid f\left(\bar{x}\right)\geq0\,\forall f\in F\right\}  for some finite set of polynomials F\subseteq\mathbb{R}\left[x_{1},...,x_{n}\right]. Unlike in the case of affine algebraic sets, if F\subseteq\mathbb{R}\left[x_{1},...,x_{n}\right] is an arbitrary set of polynomials, \left\{ \bar{x}\in\mathbb{R}^{n}\mid f\left(\bar{x}\right)\geq0\,\forall f\in F\right\}  is not necessarily a closed semialgebraic set. As a result, the closed semialgebraic sets are not the closed sets of a topology on \mathbb{R}^{n}. In the topology on \mathbb{R}^{n} generated by closed semialgebraic sets being closed, the closed sets are the sets of the form \left\{ \bar{x}\in\mathbb{R}^{n}\mid f\left(\bar{x}\right)\geq0\,\forall f\in F\right\}  for arbitrary F\subseteq\mathbb{R}\left[x_{1},...,x_{n}\right]. Semialgebraic geometry usually restricts itself to the study of semialgebraic sets, but here I wish to consider all the closed sets of this topology. Notice that closed semialgebraic sets are also closed in the standard topology, so the standard topology is a refinement of this one. Notice also that the open ball B_{r}\left(\bar{p}\right) of radius r centered at \bar{p} is the complement of the closed semialgebraic set \left\{ \bar{x}\in\mathbb{R}^{n}\mid\left|\bar{x}-\bar{p}\right|^{2}-r^{2}\geq0\right\} , and these open balls are a basis for the standard topology, so this topology is a refinement of the standard one. Thus, the topology I have defined is exactly the standard topology on \mathbb{R}^{n}.

In algebra, instead of referring to a set of polynomials, it is often nicer to talk about the ideal generated by that set instead. What is the analog of an ideal in ordered algebra? It's this thing:

Definition: If A is a partially ordered commutative ring, a cone C in A is a subsemiring of A which contains all positive elements, and such that C\cap-C is an ideal of A. By "subsemiring", I mean a subset that contains 0 and 1, and is closed under addition and multiplication (but not necessarily negation). If F\subseteq A, the cone generated by F, denoted \left\langle F\right\rangle , is the smallest cone containing F. Given a cone C, the ideal C\cap-C will be called the interior ideal of C, and denoted C^{\circ}.

\mathbb{R}\left[x_{1},...,x_{n}\right] is partially ordered by f\geq g\iff f\left(\bar{x}\right)\geq g\left(\bar{x}\right)\,\forall\bar{x}\in\mathbb{R}^{n}. If F\subseteq\mathbb{R}\left[x_{1},...,x_{n}\right] is a set of polynomials and \bar{x}\in\mathbb{R}^{n}, then f\left(\bar{x}\right)\geq0\,\forall f\in F\iff f\left(\bar{x}\right)\geq0\,\forall f\in\left\langle F\right\rangle . Thus I can consider closed sets to be defined by cones. We now have a Galois connection between cones of \mathbb{R}\left[x_{1},...,x_{n}\right] and subsets of \mathbb{R}^{n}, given by, for a cone C, its positive-set is P_{\mathbb{R}}\left(C\right):=\left\{ \bar{x}\in\mathbb{R}^{n}\mid f\left(\bar{x}\right)\geq0\,\forall f\in C\right\}  (I'm calling it the "positive-set" even though it is where the polynomials are all non-negative, because "non-negative-set" is kind of a mouthful), and for X\subseteq\mathbb{R}^{n}, its cone is C_{\mathbb{R}}\left(X\right):=\left\{ f\in\mathbb{R}\left[x_{1},...,x_{n}\right]\mid f\left(\bar{x}\right)\geq0\,\forall\bar{x}\in X\right\} . P_{\mathbb{R}}\circ C_{\mathbb{R}} is closure in the standard topology on \mathbb{R}^{n} (the analog in algebraic geometry is closure in the Zariski topology on \mathbb{C}^{n}). A closed set X is semialgebraic if and only if it is the positive-set of a finitely-generated cone.

Quotients by cones, and coordinate rings

An affine algebraic set V is associated with its coordinate ring \mathbb{C}\left[V\right]:=\mathbb{C}\left[x_{1},...,x_{n}\right]/I\left(V\right). We can do something analogous for closed subsets of \mathbb{R}^{n}.

Definition: If A is a partially ordered commutative ring and C\subseteq A is a cone, A/C is the ring A/C^{\circ}, equipped with the partial order given by f+C^{\circ}\geq g+C^{\circ} if and only if f-g\in C, for f,g\in A.

Definition: If X\subseteq\mathbb{R}^{n} is closed, the coordinate ring of X is \mathbb{R}\left[X\right]:=\mathbb{R}\left[x_{1},...,x_{n}\right]/C\left(X\right). This is the ring of functions X\rightarrow\mathbb{R} that are restrictions of polynomials, ordered by f\geq g if and only if f\left(\bar{x}\right)\geq g\left(\bar{x}\right)\,\forall\bar{x}\in X. For arbitrary X\subseteq\mathbb{R}^{n}, the ring of regular functions on X, denoted \mathcal{O}\left(X\right), consists of functions on X that are locally ratios of polynomials, again ordered by f\geq g if and only if f\left(\bar{x}\right)\geq g\left(\bar{x}\right)\,\forall\bar{x}\in X. Assigning its ring of regular functions to each open subset of X endows X with a sheaf of partially ordered commutative rings.

For closed X\subseteq\mathbb{R}^{n}, \mathbb{R}\left[X\right]\subseteq\mathcal{O}\left(X\right), and this inclusion is generally proper, both because it is possible to divide by polynomials that do not have roots in X, and because X may be disconnected, making it possible to have functions given by different polynomials on different connected components.

Positivstellensätze

What is C_{\mathbb{R}}\circ P_{\mathbb{R}}? The Nullstellensatz says that its analog in algebraic geometry is the radical of an ideal. As such, we could say that the radical of a cone C, denoted \text{Rad}_{\mathbb{R}}\left(C\right), is C_{\mathbb{R}}\left(P_{\mathbb{R}}\left(C\right)\right), and that a cone C is radical if C=\text{Rad}_{\mathbb{R}}\left(C\right). In algebraic geometry, the Nullstellensatz shows that a notion of radical ideal defined without reference to algebraic sets in fact characterizes the ideals which are closed in the corresponding Galois connection. It would be nice to have a description of the radical of a cone that does not refer to the Galois connection. There is a semialgebraic analog of the Nullstellensatz, but it does not quite characterize radical cones.

Positivstellensatz 1: If C\subseteq\mathbb{R}\left[x_{1},...,x_{n}\right] is a finitely-generated cone and p\in\mathbb{R}\left[x_{1},...,x_{n}\right] is a polynomial, then p\left(\bar{x}\right)>0\,\forall\bar{x}\in P_{\mathbb{R}}\left(C\right) if and only if \exists f\in C such that pf-1\in C.
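
As a small worked instance of Positivstellensatz 1 (an illustration of my choosing, not taken from the literature), let C be the cone generated by 1-x^{2}, so that P_{\mathbb{R}}\left(C\right)=\left[-1,1\right], and let p=2-x, which is strictly positive there. Taking f=1, the theorem asks for pf-1=1-x to lie in C, and indeed 1-x=\frac{1}{2}\left(1-x\right)^{2}+\frac{1}{2}\left(1-x^{2}\right), a square (hence a positive element) plus a positive multiple of the generator. The sketch below checks that identity exactly; the helper names trim, add, and mul are hypothetical, and polynomials are represented as coefficient lists with index equal to degree.

```python
from fractions import Fraction


def trim(poly):
    """Drop trailing zero coefficients; poly[i] is the coefficient of x**i."""
    poly = list(poly)
    while poly and poly[-1] == 0:
        poly.pop()
    return poly


def add(a, b):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return trim(x + y for x, y in zip(a, b))


def mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)


g = [1, 0, -1]            # generator 1 - x^2 of the cone C
p = [2, -1]               # p = 2 - x, strictly positive on [-1, 1]
f = [1]                   # f = 1, which lies in every cone

lhs = add(mul(p, f), [-1])                 # p*f - 1
half = [Fraction(1, 2)]
square = mul([1, -1], [1, -1])             # (1 - x)^2, a square
rhs = add(mul(half, square), mul(half, g)) # certificate that lhs is in C
```

Here lhs and rhs agree as coefficient lists, which confirms that pf-1 is a sum of elements of C.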

There are two ways in which this is unsatisfactory: first, it applies only to finitely-generated cones, and second, it tells us exactly which polynomials are strictly positive everywhere on a closed semialgebraic set, whereas we want to know which polynomials are non-negative everywhere on a set.

The second problem is easier to handle: a polynomial p is non-negative everywhere on a set S if and only if there is a decreasing sequence of polynomials \left(p_{i}\mid i\in\mathbb{N}\right) converging to p such that each p_{i} is strictly positive everywhere on S. Thus, to find \text{Rad}_{\mathbb{R}}\left(C\right), it is enough to first find all the polynomials that are strictly positive everywhere on P_{\mathbb{R}}\left(C\right), and then take the closure under lower limits. Thus we have a characterization of radicals of finitely-generated cones.

Positivstellensatz 2: If C\subseteq\mathbb{R}\left[x_{1},...,x_{n}\right] is a finitely-generated cone, \text{Rad}_{\mathbb{R}}\left(C\right) is the closure of \left\{ p\in\mathbb{R}\left[x_{1},...,x_{n}\right]\mid\exists f\in C\, pf-1\in C\right\} , where the closure of a subset X\subseteq\mathbb{R}\left[x_{1},...,x_{n}\right] is defined to be the set of all polynomials in \mathbb{R}\left[x_{1},...,x_{n}\right] which are infima of chains contained in X.

This still doesn't even tell us what's going on for cones which are not finitely-generated. However, we can generalize the Positivstellensatz to some other cones.

Positivstellensatz 3: Let C\subseteq\mathbb{R}\left[x_{1},...,x_{n}\right] be a cone containing a finitely-generated subcone D\subseteq C such that P_{\mathbb{R}}\left(D\right) is compact. If p\in\mathbb{R}\left[x_{1},...,x_{n}\right] is a polynomial, then p\left(\bar{x}\right)>0\,\forall\bar{x}\in P_{\mathbb{R}}\left(C\right) if and only if \exists f\in C such that pf-1\in C. As before, it follows that \text{Rad}_{\mathbb{R}}\left(C\right) is the closure of \left\{ p\in\mathbb{R}\left[x_{1},...,x_{n}\right]\mid\exists f\in C\, pf-1\in C\right\}.

proof: For a given p\in\mathbb{R}\left[x_{1},...,x_{n}\right], \left\{ \bar{x}\in\mathbb{R}^{n}\mid p\left(\bar{x}\right)\leq0\right\} \cap P_{\mathbb{R}}\left(C\right)=\left\{ \bar{x}\in\mathbb{R}^{n}\mid p\left(\bar{x}\right)\leq0\right\} \cap\bigcap\left\{ P_{\mathbb{R}}\left(\left\langle f\right\rangle \right)\mid f\in C\right\} , an intersection of closed sets contained in the compact set P_{\mathbb{R}}\left(D\right), which is thus empty if and only if some finite subcollection of them has empty intersection within P_{\mathbb{R}}\left(D\right). Thus if p is strictly positive everywhere on P_{\mathbb{R}}\left(C\right), then there is some finitely generated subcone E\subseteq C such that p is strictly positive everywhere on P_{\mathbb{R}}\left(E\right)\cap P_{\mathbb{R}}\left(D\right)=P_{\mathbb{R}}\left(\left\langle E\cup D\right\rangle \right), and \left\langle E\cup D\right\rangle is finitely-generated, so by Positivstellensatz 1, there is f\in\left\langle E\cup D\right\rangle \subseteq C such that pf-1\in\left\langle E\cup D\right\rangle \subseteq C. \square

For cones that are not finitely-generated and do not contain any finitely-generated subcones with compact positive-sets, the Positivstellensatz will usually fail. Thus, it seems likely that if there is a satisfactory general definition of radical for cones in arbitrary partially ordered commutative rings that agrees with this one in \mathbb{R}\left[x_{1},...,x_{n}\right], then there is also an abstract notion of "having a compact positive-set" for such cones, even though they don't even have positive-sets associated with them.

Beyond \mathbb{R}^{n}

An example of a cone for which the Positivstellensatz fails is C_{\infty}:=\left\{ f\in\mathbb{R}\left[x\right]\mid\exists x\in\mathbb{R}\,\forall y\geq x\, f\left(y\right)\geq0\right\} , the cone of polynomials that are non-negative on sufficiently large inputs (equivalently, the cone of polynomials that are either 0 or have positive leading coefficient). P_{\mathbb{R}}\left(C_{\infty}\right)=\emptyset, and -1 is strictly positive on \emptyset, but for every f\in C_{\infty}, -f-1\notin C_{\infty}.
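
The failure can be checked mechanically using the leading-coefficient description of C_{\infty}. The following is a minimal sketch with hypothetical helper names (in_c_inf, neg_minus_one); polynomials are coefficient lists with index equal to degree, and the sample polynomials are arbitrary choices.

```python
def in_c_inf(coeffs):
    """Membership in C_infinity: the zero polynomial, or positive leading coefficient.

    coeffs[i] is the coefficient of x**i.
    """
    trimmed = list(coeffs)
    while trimmed and trimmed[-1] == 0:
        trimmed.pop()
    return not trimmed or trimmed[-1] > 0


def neg_minus_one(coeffs):
    """Return the coefficient list of -f - 1 for the polynomial f."""
    out = [-c for c in coeffs] or [0]
    out[0] -= 1
    return out


# Arbitrary sample elements of C_infinity: 0, 3, x, and 2x^2 - 5.
examples = [[], [3], [0, 1], [-5, 0, 2]]
members = [in_c_inf(f) for f in examples]
certificates = [in_c_inf(neg_minus_one(f)) for f in examples]
```

Every entry of members is True while every entry of certificates is False: taking p=-1, none of these f satisfies pf-1\in C_{\infty}, in line with the claim that no f\in C_{\infty} does.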

However, it doesn't really look like C_{\infty} is trying to point to the empty set; instead, C_{\infty} is trying to describe the set of all infinitely large reals, which only looks like the empty set because there are no infinitely large reals. Similar phenomena can occur even for cones that do contain finitely-generated subcones with compact positive-sets. For example, let C_{\varepsilon}:=\left\{ f\in\mathbb{R}\left[x\right]\mid\exists x>0\,\forall y\in\left[0,x\right]\, f\left(y\right)\geq0\right\} . P_{\mathbb{R}}\left(C_{\varepsilon}\right)=\left\{ 0\right\} , but C_{\varepsilon} is trying to point out the set containing 0 and all positive infinitesimals. Since \mathbb{R} has no infinitesimals, this looks like \left\{ 0\right\} .

To formalize this intuition, we can change the Galois connection. We could say that for a cone C\subseteq\mathbb{R}\left[x_{1},...,x_{n}\right], P_{\text{*}\mathbb{R}}\left(C\right):=\left\{ \bar{x}\in\left(\text{*}\mathbb{R}\right)^{n}\mid f\left(\bar{x}\right)\geq0\,\forall f\in C\right\} , where \text{*}\mathbb{R} is the field of hyperreals. All you really need to know about \text{*}\mathbb{R} is that it is a big ordered field extension of \mathbb{R}. P_{\text{*}\mathbb{R}}\left(C_{\infty}\right) is the set of hyperreals that are bigger than any real number, and P_{\text{*}\mathbb{R}}\left(C_{\varepsilon}\right) is the set of hyperreals that are non-negative and smaller than any positive real. The cone of a subset X\subseteq\left(\text{*}\mathbb{R}\right)^{n}, denoted C_{\text{*}\mathbb{R}}\left(X\right), will be defined as before, still consisting only of polynomials with real coefficients. This defines a topology on \left(\text{*}\mathbb{R}\right)^{n} by saying that the closed sets are the fixed points of P_{\text{*}\mathbb{R}}\circ C_{\text{*}\mathbb{R}}. This topology is not T_{0} because, for example, there are many hyperreals that are larger than all reals, and they cannot be distinguished by polynomials with real coefficients. There is no use keeping track of the difference between points that are in the same closed sets. If you have a topology that is not T_{0}, you can make it T_{0} by identifying any pair of points that have the same closure. If we do this to \left(\text{*}\mathbb{R}\right)^{n}, we get what I'm calling ordered affine n-space over \mathbb{R}.

Definition: An n-type over \mathbb{R} is a set \Phi of inequalities, consisting of, for each polynomial f\in\mathbb{R}\left[x_{1},...,x_{n}\right], one of the inequalities f\left(\bar{x}\right)\geq0 or f\left(\bar{x}\right)<0, such that there is some totally ordered field extension \mathcal{R}\supseteq\mathbb{R} and \bar{x}\in\mathcal{R}^{n} such that all inequalities in \Phi are true about \bar{x}. \Phi is called the type of \bar{x}. Ordered affine n-space over \mathbb{R}, denoted \mathbb{OA}_{\mathbb{R}}^{n}, is the set of n-types over \mathbb{R}.

Compactness Theorem: Let \Phi be a set of inequalities consisting of, for each polynomial f\in\mathbb{R}\left[x_{1},...,x_{n}\right], one of the inequalities f\left(\bar{x}\right)\geq0 or f\left(\bar{x}\right)<0. Then \Phi is an n-type if and only if for any finite subset \Delta\subseteq\Phi, there is \bar{x}\in\mathbb{R}^{n} such that all inequalities in \Delta are true about \bar{x}.

proof: Follows from the compactness theorem of first-order logic and the fact that ordered field extensions of \mathbb{R} embed into elementary extensions of \mathbb{R}. The theorem is not obvious if you do not know what those mean. \square

An n-type represents an n-tuple of elements of an ordered field extension of \mathbb{R}, up to the equivalence relation that identifies two such tuples that relate to \mathbb{R} by polynomials in the same way. One way that a tuple of elements of an extension of \mathbb{R} can relate to elements of \mathbb{R} is to equal a tuple of elements of \mathbb{R}, so there is a natural inclusion \mathbb{R}^{n}\subseteq\mathbb{OA}_{\mathbb{R}}^{n} that associates an n-tuple of reals with the set of polynomial inequalities that are true at that n-tuple.

A tuple of polynomials \left(f_{1},...,f_{m}\right)\in\left(\mathbb{R}\left[x_{1},...,x_{n}\right]\right)^{m} describes a function f:\mathbb{R}^{n}\rightarrow\mathbb{R}^{m}, which extends naturally to a function f:\mathbb{OA}_{\mathbb{R}}^{n}\rightarrow\mathbb{OA}_{\mathbb{R}}^{m} by letting f\left(\Phi\right) be the type of \left(f_{1}\left(\bar{x}\right),...,f_{m}\left(\bar{x}\right)\right), where \bar{x} is an n-tuple of elements of type \Phi in an extension of \mathbb{R}. In particular, a polynomial f\in\mathbb{R}\left[x_{1},...,x_{n}\right] extends to a function f:\mathbb{OA}_{\mathbb{R}}^{n}\rightarrow\mathbb{OA}_{\mathbb{R}}^{1}, and \mathbb{OA}_{\mathbb{R}}^{1} is totally ordered by \Phi\geq\Psi if and only if x\geq y, where x and y are elements of type \Phi and \Psi, respectively, in an extension of \mathbb{R}. f\left(\Phi\right)\geq0 if and only if "f\left(\bar{x}\right)\geq0"\in\Phi, so we can talk about inequalities satisfied by types in place of talking about inequalities contained in types.

I will now change the Galois connection that we are talking about yet again (last time, I promise). It will now be a Galois connection between the set of cones in \mathbb{R}\left[x_{1},...,x_{n}\right] and the set of subsets of \mathbb{OA}_{\mathbb{R}}^{n}. For a cone C\subseteq\mathbb{R}\left[x_{1},...,x_{n}\right], P\left(C\right):=\left\{ \Phi\in\mathbb{OA}_{\mathbb{R}}^{n}\mid f\left(\Phi\right)\geq0\,\forall f\in C\right\} . For a set X\subseteq\mathbb{OA}_{\mathbb{R}}^{n}, C\left(X\right):=\left\{ f\in\mathbb{R}\left[x_{1},...,x_{n}\right]\mid f\left(\Phi\right)\geq0\,\forall\Phi\in X\right\} . Again, this defines a topology on \mathbb{OA}_{\mathbb{R}}^{n} by saying that fixed points of P\circ C are closed. \mathbb{OA}_{\mathbb{R}}^{n} is T_{0}; in fact, it is the T_{0} topological space obtained from \left(\text{*}\mathbb{R}\right)^{n} by identifying points with the same closure as mentioned earlier. \mathbb{OA}_{\mathbb{R}}^{n} is also compact, as can be seen from the compactness theorem. \mathbb{OA}_{\mathbb{R}}^{n} is not T_{1} (unless n=0). Note that model theorists have their own topology on \mathbb{OA}_{\mathbb{R}}^{n}, which is distinct from the one I use here, and is a refinement of it.

The new Galois connection is compatible with the old one via the inclusion \mathbb{R}^{n}\subseteq\mathbb{OA}_{\mathbb{R}}^{n}, in the sense that if X\subseteq\mathbb{R}^{n}, then C_{\mathbb{R}}\left(X\right)=C\left(X\right) (where we identify X with its image in \mathbb{OA}_{\mathbb{R}}^{n}), and for a cone C\subseteq\mathbb{R}\left[x_{1},...,x_{n}\right], P_{\mathbb{R}}\left(C\right)=P\left(C\right)\cap\mathbb{R}^{n}.

Like our intermediate Galois connection \left(P_{\text{*}\mathbb{R}},C_{\text{*}\mathbb{R}}\right), our final Galois connection \left(P,C\right) succeeds in distinguishing P\left(C_{\infty}\right) and P\left(C_{\varepsilon}\right) from \emptyset and \left\{ 0\right\} , respectively, in the desirable manner. P\left(C_{\infty}\right) consists of the type of numbers larger than any real, and P\left(C_{\varepsilon}\right) consists of the types of 0 and of positive numbers smaller than any positive real.

Just like for subsets of \mathbb{R}^{n}, a closed subset X\subseteq\mathbb{OA}_{\mathbb{R}}^{n} has a coordinate ring \mathbb{R}\left[X\right]:=\mathbb{R}\left[x_{1},...,x_{n}\right]/C\left(X\right), and an arbitrary X\subseteq\mathbb{OA}_{\mathbb{R}}^{n} has a ring of regular functions \mathcal{O}\left(X\right) consisting of functions on X that are locally ratios of polynomials, ordered by f\geq0 if and only if \forall\Phi\in X, where f=\frac{p}{q} is a representation of f as a ratio of polynomials in a neighborhood of \Phi, either p\left(\Phi\right)\geq0 and q\left(\Phi\right)>0, or p\left(\Phi\right)\leq0 and q\left(\Phi\right)<0, and f\geq g if and only if f-g\geq0. As before, \mathbb{R}\left[X\right]\subseteq\mathcal{O}\left(X\right) for closed X\subseteq\mathbb{OA}_{\mathbb{R}}^{n}.

\mathbb{OA}_{\mathbb{R}}^{n} is analogous to \mathbb{A}_{\mathbb{C}}^{n} from algebraic geometry because if, in the above definitions, you replace "\geq" and "<" with "=" and "\neq", replace totally ordered field extensions with field extensions, and replace cones with ideals, then you recover a description of \mathbb{A}_{\mathbb{C}}^{n}, in the sense of \text{Spec}\left(\mathbb{C}\left[x_{1},...,x_{n}\right]\right).

What about an analog of projective space? Since we're paying attention to order, we should look at spheres, not real projective space. The n-sphere over \mathbb{R}, denoted \mathbb{S}_{\mathbb{R}}^{n}, can be described as the locus of \left|\bar{x}\right|^{2}=1 in \mathbb{OA}_{\mathbb{R}}^{n+1}.

For any totally ordered field k, we can define \mathbb{OA}_{k}^{n} similarly to \mathbb{OA}_{\mathbb{R}}^{n}, as the space of n-types over k, defined as above, replacing \mathbb{R} with k (although a model theorist would no longer call it the space of n-types over k). The compactness theorem is not true for arbitrary k, but its corollary that \mathbb{OA}_{k}^{n} is compact still is true.

Visualizing \mathbb{OA}_{\mathbb{R}}^{n} and \mathbb{S}_{\mathbb{R}}^{n}

\mathbb{S}_{\mathbb{R}}^{n} should be thought of as the n-sphere with infinitesimals in all directions around each point. Specifically, \mathbb{S}_{\mathbb{R}}^{0} is just \mathbb{S}^{0}, a pair of points. The closed points of \mathbb{S}_{\mathbb{R}}^{n+1} are the points of \mathbb{S}^{n+1}, and for each closed point p, there is an n-sphere of infinitesimals around p, meaning a copy of \mathbb{S}_{\mathbb{R}}^{n}, each point of which has p in its closure.

\mathbb{OA}_{\mathbb{R}}^{n} should be thought of as n-space with infinitesimals in all directions around each point, and infinities in all directions. Specifically, \mathbb{OA}_{\mathbb{R}}^{n} contains \mathbb{R}^{n}, and for each point p\in\mathbb{R}^{n}, there is an n-1-sphere of infinitesimals around p, and there is also a copy of \mathbb{S}_{\mathbb{R}}^{n-1} around the whole thing, the closed points of which are limits of rays in \mathbb{R}^{n}.

\mathbb{OA}_{\mathbb{R}}^{n} and \mathbb{S}_{\mathbb{R}}^{n} relate to each other the same way that \mathbb{R}^{n} and \mathbb{S}^{n} do. If you remove a closed point from \mathbb{S}_{\mathbb{R}}^{n}, you get \mathbb{OA}_{\mathbb{R}}^{n}, where the sphere of infinitesimals around the removed closed point becomes the sphere of infinities of \mathbb{OA}_{\mathbb{R}}^{n}.

More generally, if k is a totally ordered field, let k^{r} be its real closure. \mathbb{OA}_{k}^{n} consists of the Cauchy completion of \left(k^{r}\right)^{n} (as a metric space with distances valued in k^{r}), and for each point p\in\left(k^{r}\right)^{n} (though not for points that are limits of Cauchy sequences that do not converge in \left(k^{r}\right)^{n}), an (n-1)-sphere \mathbb{S}_{k}^{n-1} of infinitesimals around p, and an (n-1)-sphere \mathbb{S}_{k}^{n-1} around the whole thing, where \mathbb{S}_{k}^{n} is the locus of \left|\bar{x}\right|^{2}=1 in \mathbb{OA}_{k}^{n+1}. \mathbb{OA} does not distinguish between fields with the same real closure.

More Positivstellensätze

This Galois connection gives us a new notion of what it means for a cone to be radical, which is distinct from the old one and is better, so I will define \text{Rad}\left(C\right) to be C\left(P\left(C\right)\right). A cone C will be called radical if C=\text{Rad}\left(C\right). Again, it would be nice to be able to characterize radical cones without referring to the Galois connection. And this time, I can do it. Note that since \mathbb{OA}_{\mathbb{R}}^{n} is compact, the proof of Positivstellensatz 3 shows that in our new context, the Positivstellensatz holds for all cones, since even the subcone generated by \emptyset has a compact positive-set.

Positivstellensatz 4: If C\subseteq\mathbb{R}\left[x_{1},...,x_{n}\right] is a cone and p\in\mathbb{R}\left[x_{1},...,x_{n}\right] is a polynomial, then p\left(\Phi\right)>0\,\forall\Phi\in P\left(C\right) if and only if \exists f\in C such that pf-1\in C.

However, we can no longer add in lower limits of sequences of polynomials. For example, -x+\varepsilon\in C_{\varepsilon} for all real \varepsilon>0, but -x\notin C_{\varepsilon}, even though C_{\varepsilon} is radical. This happens because, where \Sigma is the type of positive infinitesimals, -\Sigma+\varepsilon>0 for real \varepsilon>0, but -\Sigma<0. However, we can add in lower limits of sequences contained in finitely-generated subcones, and this is all we need to add, so this characterizes radical cones.

Positivstellensatz 5: If C\subseteq\mathbb{R}\left[x_{1},...,x_{n}\right] is a cone, \text{Rad}\left(C\right) is the union over all finitely-generated subcones D\subseteq C of the closure of \left\{ p\in\mathbb{R}\left[x_{1},...,x_{n}\right]\mid\exists f\in D\, pf-1\in D\right\}  (again the closure of a subset X\subseteq\mathbb{R}\left[x_{1},...,x_{n}\right] is defined to be the set of all polynomials in \mathbb{R}\left[x_{1},...,x_{n}\right] which are infima of chains contained in X).

Proof: Suppose D\subseteq C is a subcone generated by a finite set \left\{ f_{1},...,f_{m}\right\} , and q is the infimum of a chain \left\{ q_{\alpha}\right\} _{\alpha\in A}\subseteq\left\{ p\in\mathbb{R}\left[x_{1},...,x_{n}\right]\mid\exists f\in D\, pf-1\in D\right\} . For any \bar{x}\in\mathbb{R}^{n}, if f_{i}\left(\bar{x}\right)\geq0 for each i, then q_{\alpha}\left(\bar{x}\right)>0 for each \alpha, and hence q\left(\bar{x}\right)\geq0. That is, the finite set of inequalities \left\{ f_{i}\left(\bar{x}\right)\geq0\mid1\leq i\leq m\right\} \cup\left\{ q\left(\bar{x}\right)<0\right\}  does not hold anywhere in \mathbb{R}^{n}. By the compactness theorem, there are no n-types satisfying all those inequalities. Given \Phi\in P\left(C\right), f_{i}\left(\Phi\right)\geq0 for each i, so q\left(\Phi\right)\nless0; that is, q\left(\Phi\right)\geq0.

Conversely, suppose q\in\text{Rad}\left(C\right). Then by the compactness theorem, there are some f_{1},...,f_{m}\in C such that q\in\text{Rad}\left(\left\langle f_{1},...,f_{m}\right\rangle \right). Then \forall\varepsilon>0, q+\varepsilon is strictly positive on P\left(\left\langle f_{1},...,f_{m}\right\rangle \right), and hence by Positivstellensatz 4, \exists f\in\left\langle f_{1},...,f_{m}\right\rangle  such that \left(q+\varepsilon\right)f-1\in\left\langle f_{1},...,f_{m}\right\rangle . That is, writing D=\left\langle f_{1},...,f_{m}\right\rangle , a finitely-generated subcone of C, \left\{ q+\varepsilon\mid\varepsilon>0\right\} is a chain contained in \left\{ p\in\mathbb{R}\left[x_{1},...,x_{n}\right]\mid\exists f\in D\, pf-1\in D\right\} whose infimum is q. \square

Ordered commutative algebra

Even though they are technically not isomorphic, \mathbb{C}^{n} and \text{Spec}\left(\mathbb{C}\left[x_{1},...,x_{n}\right]\right) are closely related, and can often be used interchangeably. Of the two, \text{Spec}\left(\mathbb{C}\left[x_{1},...,x_{n}\right]\right) is of a form that can be more easily generalized to more abstruse situations in algebraic geometry, which may indicate that it is the better thing to talk about, whereas \mathbb{C}^{n} is merely the simpler thing that is easier to think about and just as good in practice in many contexts. In contrast, \mathbb{R}^{n} and \mathbb{OA}_{\mathbb{R}}^{n} are different in important ways. The situation in algebraic geometry provides further reason to pay more attention to \mathbb{OA}_{\mathbb{R}}^{n} than to \mathbb{R}^{n}.

The next thing to look for would be an analog of the spectrum of a ring for a partially ordered commutative ring (I will henceforth abbreviate "partially ordered commutative ring" as "ordered ring" in order to cut down on the profusion of adjectives) in a way that makes use of the order, and gives us \mathbb{OA}_{\mathbb{R}}^{n} when applied to \mathbb{R}\left[x_{1},...,x_{n}\right]. I will call it the order spectrum of an ordered ring A, denoted \text{OrdSpec}\left(A\right). Then of course \mathbb{OA}_{A}^{n} can be defined as \text{OrdSpec}\left(A\left[x_{1},...,x_{n}\right]\right). \text{OrdSpec}\left(A\right) should be, of course, the set of prime cones. But what even is a prime cone?

Definition: A cone \mathfrak{p}\subseteq A is prime if A/\mathfrak{p} is a totally ordered integral domain.

Definition: \text{OrdSpec}\left(A\right) is the set of prime cones in A, equipped with the topology whose closed sets are the sets of prime cones containing a given cone.

An n-type \Phi\in\mathbb{OA}_{\mathbb{R}}^{n} can be seen as a cone, by identifying it with \left\{ f\in\mathbb{R}\left[x_{1},...,x_{n}\right]\mid f\left(\Phi\right)\geq0\right\} , aka C\left(\left\{ \Phi\right\} \right). Under this identification, \mathbb{OA}_{\mathbb{R}}^{n}=\text{OrdSpec}\left(\mathbb{R}\left[x_{1},...,x_{n}\right]\right), as desired. The prime cones in \mathbb{R}\left[x_{1},...,x_{n}\right] are also the radical cones C such that P\left(C\right) is irreducible. Notice that irreducible subsets of \mathbb{OA}_{\mathbb{R}}^{n} are much smaller than irreducible subsets of \mathbb{A}_{\mathbb{C}}^{n}; in particular, none of them contain more than one element of \mathbb{R}^{n}.

There is also a natural notion of maximal cone.

Definition: A cone \mathfrak{m}\subseteq A is maximal if \mathfrak{m}\neq A and there are no strictly intermediate cones between \mathfrak{m} and A. Equivalently, \mathfrak{m} is maximal if it is prime and is a closed point of \text{OrdSpec}\left(A\right).

Maximal ideals of \mathbb{C}\left[x_{1},...,x_{n}\right] correspond to elements of \mathbb{C}^{n}. And the cones of elements of \mathbb{R}^{n} are maximal cones in \mathbb{R}\left[x_{1},...,x_{n}\right], but unlike in the complex case, these are not all the maximal cones, since there are closed points in \mathbb{OA}_{\mathbb{R}}^{n} outside of \mathbb{R}^{n}. For example, C_{\infty} is a maximal cone, and the type of numbers greater than all reals is closed. To characterize the cones of elements of \mathbb{R}^{n}, we need something slightly different.

Definition: A cone \mathfrak{m}\subseteq A is ideally maximal if A/\mathfrak{m} is a totally ordered field. Equivalently, if \mathfrak{m} is maximal and \mathfrak{m}^{\circ} is a maximal ideal.

Elements of \mathbb{R}^{n} correspond to ideally maximal cones of \mathbb{R}\left[x_{1},...,x_{n}\right].
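
To see the three notions side by side, here is a worked example in A=\mathbb{R}\left[x\right], offered as a sketch rather than as part of the formal development. It assumes that \mathfrak{p}^{\circ} means \mathfrak{p}\cap-\mathfrak{p}, that A/\mathfrak{p} carries the order induced by \mathfrak{p}, and that C_{\infty} is the cone of the type of numbers greater than all reals. The names \mathfrak{p}_{0} and \mathfrak{p}_{0^{+}} (the cones of the point 0 and of the positive infinitesimals) are introduced here only for illustration.

```latex
% Cone of the real point 0: ideally maximal.
\mathfrak{p}_{0} = \{ f \mid f(0) \geq 0 \}, \qquad
\mathfrak{p}_{0}^{\circ} = (x), \qquad
A/\mathfrak{p}_{0} \cong \mathbb{R}

% Cone of the positive infinitesimals: prime, but not maximal.
\mathfrak{p}_{0^{+}} = \{ f \mid f = 0 \text{ or the lowest-degree nonzero coefficient of } f \text{ is positive} \}, \qquad
\mathfrak{p}_{0^{+}}^{\circ} = \{ 0 \}, \qquad
\mathfrak{p}_{0^{+}} \subsetneq \mathfrak{p}_{0} \quad (-x \in \mathfrak{p}_{0} \setminus \mathfrak{p}_{0^{+}})

% Cone of the positive infinities: maximal, but not ideally maximal.
C_{\infty} = \{ f \mid f = 0 \text{ or the leading coefficient of } f \text{ is positive} \}, \qquad
C_{\infty}^{\circ} = \{ 0 \}, \qquad
A/C_{\infty} \cong \mathbb{R}[x] \text{ ordered so that } x > r \ \forall r \in \mathbb{R}
```

In the second case the quotient is a totally ordered integral domain, so the cone is prime, but it sits strictly inside \mathfrak{p}_{0}, which matches the picture of 0 lying in the closure of the infinitesimal points around it. In the third case the quotient is \mathbb{R}\left[x\right] itself, which is not a field, so C_{\infty} is not ideally maximal.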

\text{OrdSpec} also allows us to define the radical of a cone in an arbitrary partially ordered commutative ring.

Definition: For a cone C\subseteq A, \text{Rad}\left(C\right) is the intersection of all prime cones containing C. C is radical if C=\text{Rad}\left(C\right).

Conjecture: \text{Rad}\left(C\right) is the union over all finitely-generated subcones D\subseteq C of the closure of \left\{ p\in A\mid\exists f\in D\, pf-1\in D\right\}  (as before, the closure of a subset X\subseteq A is defined to be the set of all elements of A which are infima of chains contained in X).

Order schemes

Definition: An ordered ringed space is a topological space equipped with a sheaf of ordered rings. An ordered ring is local if it has a unique ideally maximal cone, and a locally ordered ringed space is an ordered ringed space whose stalks are local.

\text{OrdSpec}\left(A\right) can be equipped with a sheaf of ordered rings \mathcal{O}_{A}, making it a locally ordered ringed space.

Definition: For a prime cone \mathfrak{p}\subseteq A, the localization of A at \mathfrak{p}, denoted A_{\mathfrak{p}}, is the ring A_{\mathfrak{p}^{\circ}} equipped with an ordering that makes it a local ordered ring. This will be the stalk at \mathfrak{p} of \mathcal{O}_{A}. A fraction \frac{a}{b}\in A_{\mathfrak{p}} (b\notin\mathfrak{p}^{\circ}) is also an element of A_{\mathfrak{q}} for any prime cone \mathfrak{q}\subseteq A whose interior ideal does not contain b. The set of such \mathfrak{q} is an open neighborhood of \mathfrak{p} (its complement is the set of prime cones containing \left\langle b,-b\right\rangle ). There is a natural map A_{\mathfrak{p}}\rightarrow\text{Frac}\left(A/\mathfrak{p}\right) given by \frac{a}{b}\mapsto\frac{a+\mathfrak{p}^{\circ}}{b+\mathfrak{p}^{\circ}}, and the total order on A/\mathfrak{p} extends uniquely to a total order on the fraction field, so for a,b\in A_{\mathfrak{p}}, we can say that a\geq b at \mathfrak{p} if this is true of their images in \text{Frac}\left(A/\mathfrak{p}\right). We can then say that a\geq b near \mathfrak{p} if a\geq b at every point in some neighborhood of \mathfrak{p}, and this is the relation that defines the ordering on A_{\mathfrak{p}}.

Definition: For open U\subseteq\text{OrdSpec}\left(A\right), \mathcal{O}_{A}\left(U\right) consists of elements of \prod_{\mathfrak{p}\in U}A_{\mathfrak{p}} that are locally ratios of elements of A. \mathcal{O}_{A}\left(U\right) is ordered by a\geq b if and only if \forall\mathfrak{p}\in U a\geq b near \mathfrak{p} (equivalently, if \forall\mathfrak{p}\in U a\geq b at \mathfrak{p}).
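
As a possible illustration of this definition, consider A=\mathbb{R}\left[x\right] and the fraction \frac{1}{1+x^{2}}. This is a sketch under the assumption that the interior ideal \mathfrak{q}^{\circ} is \mathfrak{q}\cap-\mathfrak{q}; the example is not taken from the development above.

```latex
% 1 + x^2 is strictly positive at every type, since squares are
% nonnegative in any totally ordered field extension of \mathbb{R}:
(1 + x^{2})(\Phi) \geq 1 > 0 \quad \forall \Phi \in \text{OrdSpec}(\mathbb{R}[x])

% Hence -(1 + x^2) lies in no prime cone, and 1 + x^2 in no interior ideal:
1 + x^{2} \notin \mathfrak{q}^{\circ} \quad \forall \mathfrak{q} \in \text{OrdSpec}(\mathbb{R}[x])

% So the fraction is defined in every stalk and gives a global section:
\frac{1}{1 + x^{2}} \in \mathcal{O}_{\mathbb{R}[x]}\left(\text{OrdSpec}(\mathbb{R}[x])\right),
\qquad 0 \leq \frac{1}{1 + x^{2}} \leq 1
```

Since \frac{1}{1+x^{2}} is not a polynomial, this global section does not come from an element of \mathbb{R}\left[x\right].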

A\subseteq\mathcal{O}_{A}\left(\text{OrdSpec}\left(A\right)\right), and this inclusion can be proper. Conjecture: \text{OrdSpec}\left(\mathcal{O}_{A}\left(U\right)\right)\cong U as locally ordered ringed spaces for open U\subseteq\text{OrdSpec}\left(A\right). This conjecture says that it makes sense to talk about whether or not a locally ordered ringed space looks locally like an order spectrum near a given point. Thus, if this conjecture is false, it would make the following definition look highly suspect.

Definition: An order scheme is a topological space X equipped with a sheaf of ordered commutative rings \mathcal{O}_{X} such that for some open cover of X, the restrictions of \mathcal{O}_{X} to the open sets in the cover are all isomorphic to order spectra of ordered commutative rings.

I don't have any uses in mind for order schemes, but then again, I don't know what ordinary schemes are for either and they are apparently useful, and order schemes seem like a natural analog of them.