Admin-controllable synchronous execution support #2168

SolraBizna · 2016-12-09T03:39:53Z

OpenComputers computers execute in worker threads separate from the world thread. 99.9% of the time, this is a very good thing. Sometimes, however, precise tick-perfect timing is needed. Nuclear reactor controllers, note block playback engines, and Endergenic Generators are all examples of things where tick jitter can have negative consequences. (Obviously, some of these have higher stakes than others.)

This PR adds the ability for computers to request synchronous execution, and be executed on the world thread. Since doing this across the board would be suicidal, it also adds features that allow admins to control where it can be used.

These changes were inspired by @gamax92 's experiment earlier today, and guided by feedback in #oc. @gamax92 was incredibly helpful in tracking down almost every change that needed to be made. Skye requested the microcontrollerOnly option.

Defaults

Unless specifically configured otherwise, no computer is ever Secure, and a Secure boot is required to be synchronous. With an unchanged configuration, the only change from this PR is the ability for computers to voluntarily shorten their own execution timeout.

Secure Boot

There is a new option in the configuration file, computer.sync.whitelist. When a computer first boots up, if its architecture name or EEPROM checksum is in that whitelist, it boots into a Secure state. Otherwise, it boots into an Insecure state. This allows admins to strictly control what code is given the dangerous powers this PR adds. (If the config setting computer.sync.microcontrollerOnly is true, only Microcontrollers can be Secure.)

Secure computers are allowed to request synchronous operation. If the config setting computer.sync.allowOverrideTimeout is true, Secure computers can also increase their execution timeout. (Both Secure and Insecure computers can decrease it, regardless of this config option.)

A Secure computer may enter the Insecure state at will with component.computer.giveUpSyncPrivileges(). Once it is Insecure, the only way for it to become Secure again is to reboot with a whitelisted architecture or EEPROM. This would allow "secure bootloaders" to pass Secure status only to trusted code.

Whitelisting a specific EEPROM is useful if the admin has vetted it as being "safe"; either a bootloader that only boots admin-signed code, or a self-contained "safe" program on its own. Whitelisting a specific architecture is useful if the architecture in question has its own lag-friendly execution caps. (With OC-ARM, for instance, four computers configured to run at 100x the normal clock rate, running busy looping test programs, only added ~8 milliseconds to each tick.) Either way, the computer doesn't actually become synchronous until synchronous operation is requested.

Synchronous Operation

Computers may request synchronous operation with component.computer.setSyncMode(true). Secure computers may always request synchronous operation. Insecure computers may also do so if the (dangerous!) config setting computer.sync.insecure is true.

A synchronous computer is executed on the world thread rather than in a worker thread. This means that it can respond to events in the world with tick-perfect timing. This also means it can horribly lag the world if its execution takes too long, which is why the Secure Boot feature is there.

Computers that wish to do so can return to asynchronous operation with component.computer.setSyncMode(false). If a computer is in synchronous mode and computer.sync.insecure is false and component.computer.giveUpSyncPrivileges() is called, that will also cause it to become asynchronous.

Custom Timeouts

To help mitigate potential timeout-related problems, this PR also adds a computer.setCustomTimeout function to machine.lua. Normally, only shorter timeouts than the configured default are honored; this allows synchronous computers to voluntarily limit their world-lagging potential. If the config option computer.sync.allowOverrideTimeout is set, Secure computers can also lengthen the timeout. They should only do this when the alternative is oblivion; e.g. nuclear reactor control computers.

payonel · 2016-12-09T05:11:10Z

if a machine is in synchronous mode, and it computer.pullSignal(0)s, is my machine coroutine resumed at MOST in the next tick?

Vexatos · 2016-12-09T05:30:56Z

It's a good thing that EEPROMs and architectures can only be added directly via the config file; it's a tedious process making sure that you really, really know what you're getting yourself into.

SolraBizna · 2016-12-09T06:12:21Z

if a machine is in synchronous mode, and it computer.pullSignal(0)s, is my machine coroutine resumed at MOST in the next tick?

Yes. If there is a signal in the queue, it will return immediately. If there is not a signal in the queue, it will return next tick. In either case, do not pass go and do not collect $200.

payonel

I have some suggestions for refactoring a bit. Afterwards I would be glad to test this.

payonel · 2016-12-09T05:14:05Z

src/main/resources/assets/opencomputers/lua/machine.lua

+  setCustomTimeout = function(timeout)
+     if timeout ~= nil then
+        checkArg(1, timeout, "number")
+     end


please consider changing to checkArg(1, timeout, "number", "nil")

payonel · 2016-12-09T05:15:47Z

src/main/resources/assets/opencomputers/lua/machine.lua

@@ -12,6 +12,8 @@ local function checkDeadline()
  end
 end

+local customTimeout = nil
+


let's move this up with the other machine variables at line 4

payonel · 2016-12-09T06:28:26Z

src/main/scala/li/cil/oc/server/machine/Machine.scala

+    /** The computer is in the Secure state, and is currently synchronous. */
+    val SecureSync = Value("SecureSync")
+  }
+
  /** Signals are messages sent to the Lua state from Java asynchronously. */


I have some opinions about the naming of values here and the joining of two orthagonal ideas. I would prefer two separate enums, Insecure/Secure and Async/Sync
Also, I would prefer Untrusted/Trusted in place of Insecure/Secure
I would like you to genuinely consider this separation of enums as it would simplfy the code. Give some consideration for the name change as I feel it makes it more clear.

I understand this would cause quite a few changes to the way you've already written the code, but I feel it would make the correctness of the state changes more obvious.

This is good feedback. I will execute changes along these lines tomorrow.

In particular, Trusted/Untrusted is way better terminology for communicating what's actually happening than Secure/Insecure.

… 1 data card instead of requiring tier 2

SolraBizna · 2016-12-09T17:59:54Z

I made the requested changes. I also went ahead and added trustedSecret, since without it it's difficult to make a "secure bootloader" that's actually secure. (It was going to be part of the original PR, but I honestly forgot about it...)

asiekierka · 2016-12-09T19:12:14Z

I feel a better approach would be to sync the worker thread to the server ticks (even delaying the end of the server tick until all computer computation has been done, if necessary) instead of removing the multithreaded-ness - mainly for performance reasons, there's already far too much happening in the server thread.

Also, Secure Boot? Please.

SoniEx2 · 2016-12-09T21:54:53Z

Oh come on just use the time and yield-resume loops (good ol' queueEvent-pullEvent-checkTime-doStuff).

SolraBizna · 2016-12-10T00:46:33Z

I feel a better approach would be to sync the worker thread to the server ticks (even delaying the end of the server tick until all computer computation has been done, if necessary) instead of removing the multithreaded-ness - mainly for performance reasons

I'm not sure what you mean by "sync the worker thread to the server tick". With the current world model, there's no way to thread a TileEntity update while remaining perfectly synchronized with the rest of the world. At the very least, it introduces inconsistencies in timing between the computer's TileEntity ticking and the computer making decisions that affect the world, which is exactly what synchronous mode is designed to prevent. If your application can handle that degree of inconsistency, then it can almost certainly handle just running asynchronously.

To reiterate, this PR does not make all computers synchronous, and the computers it does make synchronous are not necessarily synchronous all the time. And every synchronous application I ever planned to write will have an "inner loop" that takes up less CPU time than your average IC2 crop block.

Also, Secure Boot? Please.

Your intended meaning is not clear.

asiekierka · 2016-12-10T00:53:15Z

With the current world model, there's no way to thread a TileEntity update

while remaining perfectly synchronized with the rest of the world. No, but you can move just the operations requiring thread safety to the server thread and get the kind of timing you want without negatively impacting performance.

To reiterate, this PR does not make all computers synchronous, and the

computers it does make synchronous are not necessarily synchronous all the time. And every synchronous application I ever planned to write will have an "inner loop" that takes up less CPU time than your average IC2 crop block. You maybe, but can you say that for every user of the mod?

Your intended meaning is not clear.

In a survival game, the idea of unhackable computers, inventories, machines, etc. terrifies me. 2016-12-10 1:46 GMT+01:00 SolraBizna <notifications@github.com>:

…

I feel a better approach would be to sync the worker thread to the server ticks (even delaying the end of the server tick until all computer computation has been done, if necessary) instead of removing the multithreaded-ness - mainly for performance reasons I'm not sure what you mean by "sync the worker thread to the server tick". With the current world model, there's no way to thread a TileEntity update while remaining perfectly synchronized with the rest of the world. At the very least, it introduces inconsistencies in timing between the computer's TileEntity ticking and the computer making decisions that affect the world, which is *exactly* what synchronous mode is designed to prevent. If your application can handle that degree of inconsistency, then it can almost certainly handle just running asynchronously. To reiterate, this PR does not make all computers synchronous, and the computers it does make synchronous are not necessarily synchronous all the time. And every synchronous application I ever planned to write will have an "inner loop" that takes up less CPU time than your average IC2 crop block. Also, Secure Boot? *Please.* Your intended meaning is not clear. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#2168 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAG7apquuGPY5Dr-QXBjqtM9eEFFUR8Sks5rGfZrgaJpZM4LIjOO> .

SolraBizna · 2016-12-10T01:16:12Z

No, but you can move just the operations requiring thread safety to the server thread and get the kind of timing you want without negatively impacting performance.

I don't see how. I think I'm not understanding what you're suggesting.

You maybe, but can you say that for every user of the mod?

I most emphatically cannot. But the existence of fools does not mean we stop making dangerous tools.

In any case, it's not the users of the mod who are the real threat, it's the server admins. In order for any user to cause real problems with this PR, they have to be enabled to do so by the server admin. Providing this feature on a server is much like providing MystCraft on a server, except that the default configuration is safe, and it's much harder to reach a dangerous configuration by accident.

In normal use on servers, the admin would have a very small set of "safe EEPROMs" that are whitelisted, and be very cautious about adding any new ones to that list. (That's ignoring the majority of servers, which would simply never turn this feature on.)

In a survival game, the idea of unhackable computers, inventories, machines, etc. terrifies me.

Rightly so. But, consider:

How to hack a computer in current OC: Take the EEPROM out of the computer and put a malicious one in, optionally spoofing eeprom.get() to avoid detection. Computer hacked.
How to hack a post-"Secure Boot" computer: Take the EEPROM out of the computer and put in a malicious one in that spoofs computer.isSecure(), computer.setSyncState(), and optionally eeprom.get(). Computer hacked. (This is only necessary if a misguided admin went far out of their way to "protect" this computer; otherwise, the pre-"Secure Boot" solution still works just as well.)

"Secure Boot" isn't designed to protect computers, it's designed to protect worlds.

SolraBizna · 2016-12-10T01:24:34Z

No, but you can move just the operations requiring thread safety to the server thread and get the kind of timing you want without negatively impacting performance.

I don't see how. I think I'm not understanding what you're suggesting.

If I understand you correctly, I think you're suggesting having the "computer thread" running in its own Java thread, which receives signals and writes callbacks once a tick, on the tick. Is this correct?

SolraBizna · 2016-12-10T06:06:24Z

Upon understanding what you were suggesting, I recognized it as a better solution. It's a bit of extra overhead for very short inner loops, but it quickly becomes a big relief as inner loops grow larger. Instead of having to have a short inner loop, the synchronized computer now only need have an inner loop less than 50ms long to avoid causing world lag. Much more achievable. Unfortunately, in exchange, it causes a small amount of temporal decoherence.

Taking the current tick as T, the timing guarantees are now:

Pulling signals will pull every signal that was received before T, and possibly some between T and T+1.
Performing a synchronized call will take effect at exactly T+1.
Performing a direct call will take effect somewhere between T and T+1, exclusive.

I'm not positive that first one is good enough. I think there ought to be a stricter guarantee on signal receival. (Maybe synchronous computers can lock their signal queue until they yield?) The third one might also cause problems (on redstone inputs, for instance), but in practice such problems can generally be avoided by careful programming IFF a strict signal receival cutoff is added.

(Fiddly temporal mechanics note: tick T is actually a sequence point in the world's overall processing. When a computer experiences tick T, some of the world will already have experienced tick T, and some will still be waiting. This is okay in practice, as long as it's consistent.)

SolraBizna · 2016-12-11T15:26:08Z

With considerable effort, the "trusted secret" does enable "unhackable" computers. However, so do tier 2 data cards; unlike those, "trusted secret" requires admin intervention, and since it doesn't (itself) provide a good encryption scheme, it actually involves more work than the data card method.

bonnedav · 2017-06-16T15:05:08Z

Any update on this?

SolraBizna · 2017-06-16T16:29:37Z

I wasn't able to get it working reliably with the new approach, and unfortunately I don't have any more spare time to spend trying. (Meanwhile, OpenComputers has continued evolving and the original changes don't work anymore.)

payonel · 2017-06-16T18:42:24Z

Thanks for the digging into this and contributing @SolraBizna! This was an interesting area to explore

Admin-controllable synchronous execution support

bc5d05f

payonel requested changes Dec 9, 2016

View reviewed changes

SolraBizna added 3 commits December 9, 2016 09:42

Make changes requested by payonel (and close a potential security hole)

e8524c6

Oops. That was important.

96c3f2b

Add trustedSecret, so that bootloaders can be made secure with a tier…

503655b

… 1 data card instead of requiring tier 2

Each synchronized computer now runs in its own Java thread.

cb7ae9a

xarses mentioned this pull request Feb 10, 2017

RT mode #2269

Closed

payonel closed this Jun 16, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Admin-controllable synchronous execution support #2168

Admin-controllable synchronous execution support #2168

SolraBizna commented Dec 9, 2016

payonel commented Dec 9, 2016 •

edited

Loading

Vexatos commented Dec 9, 2016

SolraBizna commented Dec 9, 2016 •

edited by payonel

Loading

payonel left a comment

payonel Dec 9, 2016

payonel Dec 9, 2016

payonel Dec 9, 2016

payonel Dec 9, 2016

SolraBizna Dec 9, 2016

SolraBizna commented Dec 9, 2016

asiekierka commented Dec 9, 2016

SoniEx2 commented Dec 9, 2016

SolraBizna commented Dec 10, 2016

asiekierka commented Dec 10, 2016 via email

SolraBizna commented Dec 10, 2016

SolraBizna commented Dec 10, 2016

SolraBizna commented Dec 10, 2016

SolraBizna commented Dec 11, 2016

bonnedav commented Jun 16, 2017

SolraBizna commented Jun 16, 2017

payonel commented Jun 16, 2017

Admin-controllable synchronous execution support #2168

Admin-controllable synchronous execution support #2168

Conversation

SolraBizna commented Dec 9, 2016

Defaults

Secure Boot

Synchronous Operation

Custom Timeouts

payonel commented Dec 9, 2016 • edited Loading

Vexatos commented Dec 9, 2016

SolraBizna commented Dec 9, 2016 • edited by payonel Loading

payonel left a comment

Choose a reason for hiding this comment

payonel Dec 9, 2016

Choose a reason for hiding this comment

payonel Dec 9, 2016

Choose a reason for hiding this comment

payonel Dec 9, 2016

Choose a reason for hiding this comment

payonel Dec 9, 2016

Choose a reason for hiding this comment

SolraBizna Dec 9, 2016

Choose a reason for hiding this comment

SolraBizna commented Dec 9, 2016

asiekierka commented Dec 9, 2016

SoniEx2 commented Dec 9, 2016

SolraBizna commented Dec 10, 2016

asiekierka commented Dec 10, 2016 via email

SolraBizna commented Dec 10, 2016

SolraBizna commented Dec 10, 2016

SolraBizna commented Dec 10, 2016

SolraBizna commented Dec 11, 2016

bonnedav commented Jun 16, 2017

SolraBizna commented Jun 16, 2017

payonel commented Jun 16, 2017

payonel commented Dec 9, 2016 •

edited

Loading

SolraBizna commented Dec 9, 2016 •

edited by payonel

Loading