security.podzone
Project: Project Neuron
IP: Copyright free for public use
Privacy Level: Public Project
Fee Status: Pro Bono

Synopsis: To help identify aspects of BOINC projects that, in the opinion of project participants and project administrators, could be improved to give public participants a safer, smoother, more productive and more reliable experience of volunteer computing. If time and resources permit, to create prototypes that demonstrate ideas and to collect data that can act as an indicator to help users choose projects to join.

Status: Ongoing. This page will be changed from time to time.

Alerts/Disclaimer: Whilst every effort has been made to gather accurate information, your use of information from this assignment is entirely at your own risk. NO liability whatsoever is accepted in any circumstances for anything contained on this page or the pages referenced.


Below are opinions. They are intended to help identify areas of BOINC that "could" be improved through development or enhancement, and are offered as hopefully helpful advice to the BOINC developers and to projects. These suggestions and ideas are based upon participant feedback, experience and project research within Project Neuron. The list is added to as new ideas come along and is updated when ideas are taken into use. The suggestions are in no particular order. Not all ideas will be great, nor will all be feasible, but on the basis that all suggestions are welcome the following are submitted.

Ref Potential areas for further development Status: Notes Next Steps
1 A specific protocol for server availability sites to question the scheduler about its status. (Sept 06)

See also suggestion 7.

Status: Ongoing

Notes: Many uptime and status sites seek information from the project web site. These queries are unstructured and are often incomplete HTTP requests that cause the web server to raise alerts. It is hard to distinguish these semi-legitimate requests from a "hack" attempt. It is suggested that the scheduler handle such requests specifically, in a more structured manner. This would prevent "hack" attempts being misread as semi-legitimate status requests and vice versa.

A prototype is now available for testing and feedback. An XML feed is available with a number of trial data items. A 2 second delay has been injected into all responses. Times are in BST; this needs to be fixed. Feedback please.

A collation site could take all project outputs of this type and create a performance database. This would help guide users on whether projects are well run. This idea is a "primary" output from Project Neuron's work.
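A collation site consuming such a feed might parse each response roughly as follows. This is only a minimal Python sketch: the element names (project_status, scheduler_running, results_ready, timestamp) are hypothetical placeholders and are not the data items in the prototype feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample response; the real prototype feed may use different tags.
SAMPLE_FEED = """<project_status>
  <name>Example Project</name>
  <scheduler_running>1</scheduler_running>
  <results_ready>1250</results_ready>
  <timestamp>2006-09-30T12:00:00Z</timestamp>
</project_status>"""


def parse_status(xml_text):
    """Return the feed's data items as a dict, converting assumed numeric fields."""
    root = ET.fromstring(xml_text)
    # Every direct child element becomes one key/value pair.
    status = {child.tag: (child.text or "").strip() for child in root}
    # These two conversions reflect this sketch's assumptions about the fields.
    status["scheduler_running"] = status.get("scheduler_running") == "1"
    status["results_ready"] = int(status.get("results_ready") or 0)
    return status


if __name__ == "__main__":
    print(parse_status(SAMPLE_FEED))
```

A fixed, documented set of tags is what would let a collation site store the same fields for every project and compare them over time.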

Next Steps: Prototype testing.

Waiting for feedback.

         
2 A means for an admin to add new news items, and to change the most recent one, as records in the project's database. (Sept 06)

Provide a data stream such as RSS that has meaningful tags that can be interpreted correctly. (Sept 06)

Status: Final

Notes: Adding news is cumbersome. News items should be stored in the project database, and a web based means (for Ops) of changing and adding items should be made available. This would remove the need to edit raw HTML, which is error prone, and so avoid mistakes that might confuse users.

A measure of a project's success, in the eyes of users, is how often the project shares information with them. Admittedly this is a volume measure rather than a quality measure, but it helps. Making meaningful and accurate RSS tags available in news feed responses will aid the automatic checking of "project communication with users".

Whilst it may be argued that this would increase DBMS activity unfavourably, the information would probably be cached in any event and thus be no burden. Alternatively, the same caching mechanism could be used to generate the feed periodically, as happens elsewhere with BOINC web pages.
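As an illustration of a feed with meaningful tags, the sketch below builds an RSS 2.0 document from news records. It is written in Python for brevity and is not BOINC code; the row layout (title, body, link, posted) and the sample values are assumptions standing in for whatever the project database would hold.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

# Hypothetical rows, standing in for the result of a database query.
NEWS_ROWS = [
    ("Server maintenance", "The scheduler will be offline for one hour.",
     "http://example.org/news/1", datetime(2006, 9, 30, 12, 0, tzinfo=timezone.utc)),
]


def build_rss(title, link, description, news_rows):
    """Build an RSS 2.0 document from (title, body, link, posted) rows."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item_title, body, item_link, posted in news_rows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "description").text = body
        ET.SubElement(item, "link").text = item_link
        # RSS 2.0 expects RFC 822 style dates in pubDate.
        ET.SubElement(item, "pubDate").text = format_datetime(posted)
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(build_rss("Example Project news", "http://example.org/",
                    "News from the project", NEWS_ROWS))
```

Because each item carries its own pubDate, an automatic checker can count how often a project posts without having to interpret free-form HTML.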

Next Steps: Final Recommendation

         
3 A means for admins to know what the current log position is, plus tools or methods to cycle logs properly, including archiving if necessary. This should apply to all data produced regularly, e.g. stats data. (Nov 06)

Status: Final

Notes: With the exclusion of stats data, which needs separate treatment, this work is complete and there is advice at Spy Hill as well as here at security.podZone.

This project and the Pirates project worked together to create and test a config for log rotation using standard tools. You can obtain a version of Neuron's log rotation config file here if you wish. You can get the project config.xml entry that runs log_rotate periodically here. Projects may wish to pursue variations on this, but as a core it is adequate.

This development ensures that project log files do not overrun the system's available disc space, which helps avoid a system failure.

Next Steps: Final Recommendation

         
4 An automated means of balancing current demand against the period a client is told to "stand off" before asking for more work. This is set statically at the moment but could be "intelligent" and vary between some minimum and maximum based upon demand. Demand changes due to outages, network problems and so on, and this would address some aspects of the deluge of connections often seen after such an event. (Dec 06)

Status: Further research needed

Notes: The scheduler asks clients to stand off for a defined period of time, as set in the project's config.xml file. This is a completely arbitrary period which rarely changes even though project circumstances change. If projects are to maximise the work undertaken by clients, this period needs to track what is happening in the project in something closer to real time.

It is suggested that stand-off periods are determined dynamically rather than statically. If a server is not busy, shorter periods would be sent to clients, and if it is busy, longer ones, rather than blindly following the config.xml setting.
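One simple way to vary the period, offered as a sketch and not as scheduler code, is to scale linearly between the minimum and maximum according to a load figure. The function name, the 0.0 to 1.0 load scale and the linear mapping are all assumptions made for illustration; how "busy-ness" is actually measured is the open question.

```python
def dynamic_stand_off(load, min_delay, max_delay):
    """Scale the stand-off period linearly with load (0.0 idle, 1.0 saturated)."""
    if min_delay > max_delay:
        raise ValueError("min_delay must not exceed max_delay")
    # Clamp so that a bad measurement cannot push the delay out of bounds.
    load = max(0.0, min(1.0, load))
    return min_delay + load * (max_delay - min_delay)


if __name__ == "__main__":
    # With bounds of 60 and 600 seconds, a half-loaded server asks for 330 seconds.
    for load in (0.0, 0.5, 1.0):
        print(load, dynamic_stand_off(load, 60, 600))
```

The clamp means that after an outage, when measured load may spike well beyond normal, clients are never told to wait longer than the configured maximum.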

Next Steps: Ongoing.

Examine how "busy-ness" can be established and over what period of time this might be done.

Establish how the scheduler might learn about a new value for stand-off.

         
5 A suggestion has been made by this project for volunteer supporters. This has been accepted in principle and an implementation made for text based and voice based support. (Dec 06)

Status: Final

Notes: A helpers group has now been established. A structured database of problems, with instructions to follow, may be established for the helpers to use when giving advice and assistance.

Next Steps: Final Recommendation

         
6 The use of MRTG to show network performance is helpful to users and sys admins. SETI@home does this and tests on this project show it to be a useful thing to have. (Jan 07)

Status: Ongoing

Notes: Project Neuron has implemented this on a test basis and it functions well. Standard SNMP setups have been implemented, although it took a fair effort to get the test config file right. It is not currently in operation since the rebuild of the Neuron system to FC6 but it can be seen at SETI.

The demonstrator is here. Do see the SETI@home one too, though.

As an example of the help it can provide: it lets a sys admin see failed TCP connections and the average number of TCP connections made over time. This informs what tuning might be needed, whether of web servers, of network capacity, or of the rate at which work units are generated and given out, to avoid bottlenecks.

Next Steps: Re-implement demonstrator (done).
         
7 The use of SNMP MIBs to represent vital project, BOINC client and application information (including performance and status) and to make this available within an SNMP environment to monitoring systems. (Jan 07)

Status: Ongoing

Notes: This is currently undergoing further research. Following development, these MIBs could be "registered", as it were, as standards for such data extraction and setting in the future. This would give system admins and users access to data through standard protocols.

Next Steps: Development of BOINC and/or app MIB.
         
8 Projects should create a second test project to test server side changes and new clients or work unit changes prior to live running. This can be achieved on the same web server and MySQL database instances; no need for more hardware. (Jan 07)

Status: Final

Notes: This is very worthwhile. Having a second project, even on the same server, gives excellent opportunities to test new releases and changes before making that leap on production systems.

Next Steps: Final Recommendation
         
9 Apache Server Security notes (Feb 07)

Status: Constant revision

Notes: Commentary on additional security for an Apache server in the context of a BOINC project. These notes are under development and are based upon the experience of having operated a BOINC project.

Next Steps: Ongoing. Continue to tighten project security and report back.

         
10 Apache Server mod_security (Mar-Apr 07)

Status: Constant revision

Notes: The release of modsecurity2 (3 April 07) provides much of what this project's modifications to the conf file did. At the moment the standard FC6 config is in use on Project Neuron, with the one addition of disguising the server type.

If you are still on version 1, the text file below is far better than the default. If you are on version 2, the new default looks promising, subject to detailed checks.

For mod_security version 1 users a text version is available here. (Right click and save as)

Next Steps: Ongoing.

Continue to tighten project security and report back.

         
11 HTTP requests from stats, uptime, availability sites et al. (Mar-Apr 07)

Status: Constant revision

Notes: Stats and uptime sites send a variety of requests to Apache. Some are badly populated, with missing headers. A sample of more acceptable code (PHP in this example) that will not cause Apache to throw alerts is shown here at the project site. Developers are asked to meet this standard with stats, uptime and availability requests.

This is in addition to suggestion 1 above.
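For developers not working in PHP, the same idea can be sketched in Python. This is not the project's sample code: the user agent string and contact address are hypothetical, and the choice of headers is an assumption about what checkers commonly omit and security rules commonly look for.

```python
from urllib.request import Request


def build_status_request(url, contact):
    """Build a complete HEAD request that identifies the checker."""
    headers = {
        # Identify the tool and give the admin a way to reach its operator.
        "User-Agent": "example-uptime-checker/0.1 (+%s)" % contact,
        "Accept": "*/*",
        "Connection": "close",
    }
    return Request(url, headers=headers, method="HEAD")


if __name__ == "__main__":
    req = build_status_request("http://example.org/", "mailto:admin@example.org")
    print(req.get_method(), req.full_url, req.header_items())
```

The request can then be sent with urllib.request.urlopen(req, timeout=10), which adds the Host header itself. A HEAD request is enough to show that the web server is answering without asking it to transfer a page.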

Next Steps: Ongoing.

Have shared with some uptime and stats sites. Awaiting responses from others.

         
12 Integrity of application installation (Mar 07)

Status: Yet to start

Notes: There is a feeling that the use of BOINC as a means of delivering a Trojan needs to be explored and protected against. Whilst that thinking is ongoing, this project will consider how a project's application can establish whether it has been legitimately installed, and deny service if it has not been installed through the proper project joining process that BOINC provides. This is a potentially complex problem, but it is worth some thought to see whether additional protection would help protect reputation.

Next Steps: Ongoing. Establish an application security framework and test against BOINC.

13 Scheduler log file entries (Apr 07)

Status: Ongoing

Notes: BOINC produces log files to show system activity. The scheduler log file shows normal events plus those considered out of the ordinary. The latter class of event is marked as critical, e.g. non BOINC client access to the scheduler, or a config.xml tag not recognised. A further log should be produced that has those critical events logged to it, instead of or in addition to the normal log. This would make identifying critical events practical; at present it is excessively difficult.

Next Steps: Monitor
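Until such a log exists, critical entries can be pulled out of the existing one after the fact. The Python sketch below assumes critical lines carry a literal marker such as [CRITICAL]; check the marker against your own scheduler log before relying on it. The function names and file paths are hypothetical.

```python
def extract_critical(lines, marker="[CRITICAL]"):
    """Return only the log lines that contain the critical marker."""
    return [line for line in lines if marker in line]


def copy_critical(log_path, critical_path, marker="[CRITICAL]"):
    """Append the critical lines found in log_path to critical_path."""
    with open(log_path, encoding="utf-8", errors="replace") as src:
        critical = extract_critical(src, marker)
    with open(critical_path, "a", encoding="utf-8") as dst:
        dst.writelines(critical)
    return len(critical)


if __name__ == "__main__":
    sample = [
        "2007-04-01 12:00:00 [normal] request handled\n",
        "2007-04-01 12:00:05 [CRITICAL] unrecognised config.xml tag\n",
    ]
    print(extract_critical(sample))
```

Run periodically, this gives an admin one short file to watch, although it reads the whole log each time and so may copy the same entry twice unless the log has been rotated in between.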
14 Removal of orphaned download and upload files. (April 07)

Status: Completed

Notes: During testing Project Neuron carried out consistency checks between WU and RESULT files on the server and DBMS entries for known and active work available or returned. There appear to be circumstances where old upload and download files remain within the Linux/*nix file system and are no longer represented in the project database. Whilst there is an option to remove old upload results with the standard file deleter, there does not appear to be an option to do the same for orphaned download files.

This project has created a PHP script to achieve this and to make sure upload files are removed too. This can be found here. (There are no warranties or guarantees with this script, but it is in use on Project Neuron.) Read the header of the script carefully before proceeding.

This script uses the project's config.xml file. It looks for certain tags and proceeds only if they are found. It needs to be configured before running, and adequate documentation is included at the head of the script.
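The core of such a check is a comparison between the file names on disc and the names the database still knows about. The Python sketch below shows that comparison only and is not the project's PHP script: it assumes the known names have already been fetched from the database into a set, and it reports candidates without deleting anything.

```python
import os
import tempfile


def find_orphans(directory, known_names):
    """List files under directory whose names are absent from known_names."""
    orphans = []
    # Walk recursively in case files are spread across subdirectories.
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if name not in known_names:
                orphans.append(os.path.join(root, name))
    return sorted(orphans)


if __name__ == "__main__":
    # Demonstration on a throwaway directory with made-up file names.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("wu_0001", "wu_0002", "wu_old"):
            open(os.path.join(tmp, name), "w").close()
        print(find_orphans(tmp, {"wu_0001", "wu_0002"}))
```

Listing first and deleting in a separate, reviewed step is the safer design, since a stale or incomplete query result would otherwise remove files that are still needed.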

Next Steps: Updates as and when required.
         
15 Stats directories build up over time and cause filestore shortages on busy systems. (May 07)

Status: Completed

Notes: Logs are generally well provided for if log rotation is employed (see earlier). A neglected area is the production of stats.

Each time stats are produced they are placed in a directory named after the date and time of the production run, and a copy of the gzipped XML files is placed in the stats directory available to the public. This replaces anything there previously, so the public stats directory only ever holds the latest files. As a consequence the dated stats directories build up, and the BOINC system does not deal with them. This uses valuable disc space, which over time could cripple a BOINC based system.

A PHP script has been produced and tested that could be used by any BOINC project. This script is in use on this project. See the script here.
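The selection step can be sketched in Python as follows. This is an illustration and not the project's PHP script: it assumes the dated directory names sort chronologically as plain strings, that the directory holding them contains nothing else worth keeping, and that the newest few runs should be retained. The example names are made up.

```python
import os
import tempfile


def stale_stats_dirs(stats_root, keep=7):
    """Return the dated directories under stats_root beyond the newest `keep`."""
    # Only directories are considered; loose files are left alone.
    names = sorted(e.name for e in os.scandir(stats_root) if e.is_dir())
    cutoff = max(0, len(names) - keep)
    return [os.path.join(stats_root, name) for name in names[:cutoff]]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("2007_05_01_00_00", "2007_05_02_00_00", "2007_05_03_00_00"):
            os.mkdir(os.path.join(tmp, name))
        # With keep=2 only the oldest directory is reported.
        print(stale_stats_dirs(tmp, keep=2))
```

The returned paths could then be archived or removed, for example with shutil.rmtree, once the list has been checked.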

Next Steps: Final Recommendation
