We are excited to announce that the upgrade to MariaDB version 10 will take place during the August 7th compute grid maintenance window. As a reminder, the grid maintenance window runs from 8am to noon.
This upgrade will benefit our users through optimizations, bug fixes, and new engines. The optimizations will speed up both simple queries and more advanced queries that incorporate ordering or unions. The new engine choices will open new possibilities with regard to workflow, such as moving your text searches into
We have been working diligently on a solution for fine-grained backup of research databases, which will complement the standard nightly system backup. We have posted the first draft of the backup plan on the MariaDB section of our website, and we invite your comments. Also, please contact us
Later this month, if you are an HBS/Harvard-licensed user of ArcGIS, Mathematica, MATLAB, or SAS, you should receive emails with software renewal instructions. Please be aware that these software packages will expire in July. Your cooperation in the renewal process is greatly appreciated.
Additionally, we will add the open-source software RStudio v1.0, R v3.3, and Python 3.x to the compute grid in the coming months. If there are other software packages that you'd like to use on the compute grid, please
Summer has arrived, and many now have the opportunity to relax. For many others, though, this is a time to focus on research projects that have been on hold. For that reason, we ask you to be mindful of appropriate use of compute grid resources:
Use an appropriate amount of RAM for your work. A good rule of thumb is to request about 10 times your file size in memory, plus some wiggle room. So if your dataset is 500 MB,
Using multiple cores (CPUs) to analyze data is an efficient way to get more work done in less time, but only under certain circumstances. By default, R, Python, and MATLAB use only one core, even on a multicore (multi-CPU) machine, unless you specifically program them to use more. Stata, on the other hand, has been parallelized, so many of its functions can use more than one core, but only to a maximum of 75% efficiency overall. To get the most efficiency, it's best to run your 'do' files in batch; if using the interactive GUI, Stata spends