Dealing With Security Issues: The Security System

This document deals with possible security issues of XProc pipelines and introduces MorganaXProc's security management as a safeguard against these threats.

Overview:

What about these "security issues" of XProc pipelines?

As noted in XProc: An XML Pipeline Language, there are possible security threats when running XProc pipelines on your system. XProc pipelines can contain steps which may read arbitrary resources either from your local computer or from the internet ('p:http-request' and 'p:load'). With 'p:store' a pipeline may store data at any place on your system or on the internet. Finally, 'p:exec' may execute any code on your computer, possibly without you even noticing it. All these are useful steps, enabling us to develop powerful pipelines with XProc.

Having said this, let us imagine what might go wrong. Since the uri used in 'p:store' can be calculated in your pipeline, what happens if you make a miscalculation and set the attribute 'href' to name some vital resource of your operating system, your local calendar or some other important data? They are gone. Or: You might have a pipeline extracting some public information for a customer from an XML document that also contains confidential information. And then, by accident, you put the name of the wrong step into the 'p:pipe' statement binding the input port of 'p:http-request', sending the confidential data to your customer. And even worse: Since you might also be running third party pipelines, such a pipeline may read some confidential data using 'p:load' and then send it to someone using 'p:http-request' with method 'post', or just 'p:store' it at some place on the internet.

Got it? Some things may go terribly wrong when using XProc pipelines either by mistake or intentionally.

How does MorganaXProc cope with these threats?

MorganaXProc's way to prevent XProc pipelines from doing harm to your data is to establish a kind of sandbox around the pipeline and to give you control over which resources outside the sandbox can be accessed from within. This is the purpose of the so-called "security system" in MorganaXProc: Every time a pipeline tries to read a resource (in the broadest sense of 'read') or tries to write to a uri (also in the broadest sense of 'write'), MorganaXProc will ask its security system whether the pipeline is allowed to do so. If the security system grants permission, everything is fine and the pipeline can proceed. But if access to the specific resource is forbidden, MorganaXProc will stop the pipeline's execution and throw a security error informing you that a possible security issue was found. And of course: You can configure the security system, telling MorganaXProc what is allowed and what is not.

Now, having painted the broad picture, let us go into some details. Suppose you would like to prevent a pipeline, or even every pipeline, from storing data with 'p:store' on your system. How do you do this? As you might (or might not) recall from the document about configuring MorganaXProc, a security control is added to the configuration by incorporating an element called "SecurityControl" in MorganaXProc's namespace. So we have to add an element named "SecurityControl" to our configuration in order to prevent pipelines from storing data. And since MorganaXProc needs to know what to prevent, we have to use the attribute "operation" in our "SecurityControl" element. As we want to prevent the pipelines from storing data, we give the value "STORE_RESOURCE" to the attribute "operation". The second thing you have to tell the security manager is for which uris the security control should apply; this goes into the attribute "path". Since we want no data to be written anywhere on our hard drives, we might use "file:" as value for "path", since every access to our hard drive will start with the uri scheme "file:". And thirdly, we need to tell the security manager whether the SecurityControl expresses a permission or a prohibition. This is done using the "allowed" attribute with either "true" or "false". The complete security control element to prevent a pipeline from storing data on our local system will look like this:

<SecurityControl operation="STORE_RESOURCE" path="file:" allowed="false" />

Pretty easy, isn't it? Of course you can be far more detailed with the uris to control, if you wish. A security control to prevent pipelines from storing data to a folder "confidential" in your user folder might look like this:

<SecurityControl operation="STORE_RESOURCE" path="file:///users/me/confidential/" allowed="false" />

As you can see, the attribute "path" takes uris (or, to be precise, the leading fragment of a uri) as its value, and you can use it to specify exactly which resources you would like the security control to apply to.

Now let us make things a bit more complicated. Imagine you still want to protect your folder named "confidential", but you need to store data from a pipeline in an XML document named "data.xml" which happens to be in this folder. With the security controls we have added to the configuration document so far, this will not be possible. If your pipeline tries to store data in "file:///users/me/confidential/data.xml", a security error will be thrown because of the security control for the folder. What we need here are two security controls, one forbidding access to the folder and one allowing access to the document in the folder. So your configuration file should contain the following lines:

<SecurityControl operation="STORE_RESOURCE" path="file:///users/me/confidential/" allowed="false" />
<SecurityControl operation="STORE_RESOURCE" path="file:///users/me/confidential/data.xml" allowed="true" />

How does MorganaXProc's security manager know which rule to apply to a specific <p:store>? Does order matter? No! To determine whether a security-relevant operation for a given uri is allowed or not, the security manager takes into account all security controls for the operation and then selects the most specific control applicable to the uri in the step. So in our case storing to "data.xml" is allowed, not because its control comes after the control for the folder, but because the second control is more specific (its uri is longer) than the first one.

The only situation in which the order of rules does matter is when you specify two contradictory rules for literally the same path and the same operation. So if you first allow path p to be read and then forbid reading from p, MorganaXProc's security manager will take the last stated rule to be operative.
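
As an illustration of this tie-breaking rule, here is a small sketch with two contradictory controls for literally the same path and operation. The folder is just the example folder used above, and "READ_RESOURCE" is one of the operation values described further down. Going by the rule just stated, the second control should be the operative one, so reading from the folder would be forbidden:

<!-- Sketch only: same operation, same path, contradictory values -->
<SecurityControl operation="READ_RESOURCE" path="file:///users/me/confidential/" allowed="true" />
<!-- Stated last, so this is the control expected to win -->
<SecurityControl operation="READ_RESOURCE" path="file:///users/me/confidential/" allowed="false" />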

The next thing we have to talk about is the attribute "operation". What values can you give to this attribute, and to which steps or general operations in an XProc pipeline do security controls with these values apply? In other words:

In which cases do security controls apply?

RUN_PIPELINE

Security controls with "RUN_PIPELINE" as value of attribute "operation" are checked every time before MorganaXProc compiles an XProc pipeline supplied by a uri. This applies when a pipeline is called from the command line interface and also when the compiler is invoked inside a Java application.

IMPORT_PIPELINE

This value applies to XProc pipelines with a <p:import> element. Before the pipeline designated by the resolved uri is accessed, the security manager checks for an allowance to do so.

Please mind that this security control only applies when a library document is actually read. When the library is already known to MorganaXProc (because it is a built-in library or because it is explicitly loaded using configuration property "ExtensionLibrary"), no security control is performed and hence no SecurityControl entry is needed.
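
A minimal sketch of controls for these two operations might look like the following lines. The folder names are hypothetical, and using "http:" as a path is an assumption made by analogy with the "file:" example above:

<!-- Hypothetical folders; adjust the uris to your own system -->
<SecurityControl operation="RUN_PIPELINE" path="file:///users/me/pipelines/" allowed="true" />
<SecurityControl operation="IMPORT_PIPELINE" path="file:///users/me/pipelines/lib/" allowed="true" />
<!-- Assumption: "http:" matches every uri with this scheme, just as "file:" does for local files -->
<SecurityControl operation="IMPORT_PIPELINE" path="http:" allowed="false" />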

READ_RESOURCE

With this value for attribute "operation" you control reading of data by an XProc pipeline. The security controls apply before

  • an input port of a pipeline is bound using a uri (in command line interface or a Java application).
  • a resource is accessed in a <p:data>.
  • a resource is accessed in a <p:document>.
  • any document is loaded in <p:load>.
  • a <p:directory-list> is executed.
  • any schema is loaded resulting from location hints in a document in <p:validate-with-xml-schema>.
  • any document is included in <p:xinclude>.
  • any resource is imported in processing a stylesheet of a <p:xslt>.
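
To make this a bit more concrete, here is a sketch of two read controls. Both folders are hypothetical. Following the "most specific control" rule explained above, a pipeline could read everything below "projects" except the content of the subfolder "secret":

<!-- Hypothetical folders, for illustration only -->
<SecurityControl operation="READ_RESOURCE" path="file:///users/me/projects/" allowed="true" />
<!-- Longer (more specific) uri, so it takes precedence for this subfolder -->
<SecurityControl operation="READ_RESOURCE" path="file:///users/me/projects/secret/" allowed="false" />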

When using steps from the Proposed File Utilities library, a security control is performed before

  • the resource designated by option "href" in <pxf:copy/> is read.
  • the resource for <pxf:head/> and <pxf:tail/> is accessed.
  • any information about the resource is obtained in <pxf:info/>.
  • the resource designated by option "href" in <pxf:move/> is read. (Additionally DELETE_RESOURCE applies for this resource.)

STORE_RESOURCE

This security control applies before

  • any document is stored in the command line interface as result of an "output:" element.
  • data is written in <p:store>
  • data is written in <p:xsl-formatter>
  • the resource is copied to target uri using <pxf:copy/>.
  • the resource is moved to target uri using <pxf:move/>.
  • a directory is created using <pxf:mkdir/> or <pxf:tempdir/>.
  • a temporary file is created using <pxf:tempfile/>.

DELETE_RESOURCE

This security control controls whether the pipeline is authorized to remove a resource using

  • <pxf:move/>
  • <pxf:delete/>

LOG_PORT

This value for attribute "operation" marks security controls to check before any data is written as result of a <p:log> step.

EXEC_COMMAND

With this value for attribute "operation" you can establish a security control applying every time before the command in a <p:exec> step is called. Please mind that the security control only applies to the command and cannot single out certain parameters of the command.

HTTP_GET
HTTP_POST
HTTP_HEAD
HTTP_PUT
HTTP_DELETE
HTTP_PROPFIND
HTTP_PROPPATCH
HTTP_MKCOL
HTTP_COPY
HTTP_MOVE
HTTP_LOCK
HTTP_UNLOCK

These values establish security controls for a specific method of <p:http-request>. By using one of these values you are able to specify sophisticated security controls for pairs of uri (fragment) and http method: For example, you can establish a control allowing "get" for a specific uri and a second control forbidding methods "post" or "put" for the same uri.

Please note that not all of the http methods listed here are supported in MorganaXProc's default implementation. (In fact, the methods relating to WebDAV are not supported.) But as you may enhance the filesystem used by MorganaXProc, more methods than are actually supported are provided in anticipation.
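
The combination of controls per http method mentioned for the HTTP_xxxx values (allowing "get" but forbidding "post" and "put" for the same uri) might be written like this. The uri is a hypothetical placeholder:

<!-- Hypothetical uri; one control per http method -->
<SecurityControl operation="HTTP_GET" path="http://example.org/api/" allowed="true" />
<SecurityControl operation="HTTP_POST" path="http://example.org/api/" allowed="false" />
<SecurityControl operation="HTTP_PUT" path="http://example.org/api/" allowed="false" />
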
ALL_OPERATIONS

... is a handy way to establish security controls without stating one for every possible value of attribute "operation".

ALL_HTTP_OPERATIONS

... is a shortcut for all HTTP_xxxx values, so you do not need to repeat the same uri for every http method.
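
As a sketch, the two collective values could be used like this. The host name is a hypothetical placeholder, and the folder is the example folder from above:

<!-- Forbid every http method for one (hypothetical) host with a single control -->
<SecurityControl operation="ALL_HTTP_OPERATIONS" path="http://example.org/" allowed="false" />
<!-- Forbid every kind of operation on the example folder -->
<SecurityControl operation="ALL_OPERATIONS" path="file:///users/me/confidential/" allowed="false" />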

In addition to these access-oriented controls, there is one more security control called "JavaLoadAllowed". With this setting you can control whether a pipeline can declare an atomic step by connecting the step with a Java class. Allowed values for attribute "value" are "true" and "false", meaning that declaring an atomic step with a Java reference is either permitted or forbidden. See the documentation of customized steps for this feature of MorganaXProc. The default setting for this control is "false".
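
Assuming that this control is written as an element of its own in the configuration document, in the same way as "SecurityControl" (please check the documentation about configuring MorganaXProc for the exact form), switching the feature on might look like this:

<!-- Assumption: JavaLoadAllowed is an element with a "value" attribute -->
<JavaLoadAllowed value="true" />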

Having seen all these different operations for which security controls can be established, one question may have occurred to you:

Do I have to allow wanted or forbid unwanted operations?

Well, it is up to you, as you can do both. The question is what the security manager will do if it encounters an operation for a uri without any related security control. Will it let the operation pass and go on executing the XProc pipeline, or will it throw a security error? MorganaXProc's security manager comes in two flavors called "liberal" and "authoritarian". For a liberal security manager every operation is allowed except those stated to be forbidden by security controls. The authoritarian security manager works the other way round and takes everything to be forbidden except the operations explicitly allowed by security controls.

You can choose which of the two strategies the security manager uses by setting property "SecurityManager" in a configuration document. The strategy is determined by the value of attribute "strategy", which can be either "liberal" or "authoritarian". Please mind that this will establish a new security manager without any security controls. So here is a case where order matters in a configuration document: First state which security manager flavor you would like MorganaXProc to use and then state the security controls for the new manager. The other way round will not work: All controls specified before element SecurityManager will be ignored, because technically they are added to an existing security manager that is then abandoned.
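
The required order can be sketched as follows. The element and attribute names are the ones mentioned above, while the enclosing configuration document and the namespace declaration are left out here. The particular controls are only examples, and "http:" as a path is again an assumption by analogy with "file:":

<!-- First choose the strategy: this establishes a new manager without any controls -->
<SecurityManager strategy="liberal" />
<!-- Then state the controls for the new manager -->
<SecurityControl operation="STORE_RESOURCE" path="file:" allowed="false" />
<SecurityControl operation="HTTP_POST" path="http:" allowed="false" />

With a liberal manager, everything not mentioned in these controls stays allowed. If the SecurityManager element were moved below the two controls, they would be lost together with the abandoned manager.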

Now another (the last, I promise) question may come up: What will MorganaXProc do out of the box, if I state no security control at all and have no security manager in my configuration document? Or:

What are the default settings for security?

MorganaXProc will by default use an authoritarian security manager forbidding everything, except:

  • running and importing XProc pipelines from uris using scheme "file:"
  • reading resources from your user's home folder and its descendants accessed either by a relative uri or a uri with scheme "file:"
  • doing http requests with method "get" for all uris (independent of the used scheme).

That is all, so by default MorganaXProc has pretty limited access to resources, both on your computer and on the internet. This is restrictive, but it is also pretty safe.

Now we are done with security issues. Let us go on and see, how to enhance MorganaXProc with third party software.