# New Tips from the RCS Stats Team

December 7, 2018

We are often asked how to calculate marginal effects in R, especially by Stata users who rely on Stata's margins and marginsplot commands after fitting regression models. Two R packages offer functionality similar to these commands, which calculate marginal effects after a regression model and graph them:

ggeffects

...

# New Tip from the RCS Data Team

December 7, 2018

Natural Language Processing (NLP) assists computers with processing and understanding natural human language, such as speeches, tweets, and newspaper articles. NLP can range from counting the number of times a word appears in text to analyses that assess attitudes (e.g., positive, negative). NLP can be conducted on a variety of platforms, including the robust NLTK package in Python and several libraries in R.
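As a rough illustration of the simplest task mentioned above, counting how often each word appears in a text, here is a minimal Python sketch. It uses only the standard library rather than NLTK, the tokenizing rule (lowercase letters and apostrophes) is a simplifying assumption, and the sample sentence is made up for the example.

```python
import re
from collections import Counter


def word_counts(text):
    """Count how often each word appears, ignoring case and punctuation."""
    # Assumption: a "word" is a run of letters or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Hypothetical sample text, invented for this example.
sample = "The grid is new. The grid is fast, and the new grid is open."
counts = word_counts(sample)

# Show the three most frequent words with their counts.
print(counts.most_common(3))
```

A dedicated library such as NLTK would typically replace the regular expression with a proper tokenizer, but the counting step stays the same.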

For an introduction to NLTK in Python with hands-on exercises, DataCamp provides a free module as part of their NLP fundamentals course:...

# New Google Search Engine for Open Data

November 28, 2018

# Update to R on Grid 2.5

November 16, 2018

Thanks again for being an early adopter of the new compute grid. As we’re about to open officially on Monday, we would like to take this opportunity to update the central R / RStudio installation for a number of reasons. We would do this work early in the day on Monday, and it would upgrade R from 3.4.2 to 3.5.1. (Note that this update would require you to update or reinstall any R packages that had been compiled.)

Thanks,

RCS Staff

# HBS Compute Grid 2.5 is arriving Monday 11/19!

November 9, 2018

Greetings! We are happy to announce the opening of HBS Compute Grid 2.5, beginning Monday, Nov 19th.

Based on both research computing trends and recommendations from the HBS research computing environment assessment, Research Computing Services (RCS) has been working closely with HBS IT to make improvements to our local compute grid. This new compute grid, v2.5, provides the following updates and...

# Grid Maintenance - November 2018

November 2, 2018

Dear Compute Grid User,

This is a reminder that the next monthly Compute Grid maintenance is scheduled for Monday, November 5th, from 8am until 12pm.

How will this affect...

# Compute Grid 2.5 Update

October 25, 2018

In our previous newsletter we announced our new computing environment, Grid 2.5, with an anticipated release this fall. The new compute environment provides:
• Improved compute capacity through more hardware and better use of existing hardware
• Fewer restrictions on compute capacity while preventing CPU spillover
• Newer OS and software versions, and improved usability
• More software titles, including GitKraken (for version control) and Spyder, a Python IDE
• Better command-line submission scripts, to improve...

# New Tip from the RCS Data Team

October 25, 2018

The EXPLAIN statement is a useful tool in SQL databases for understanding how a query is executed and where to apply tweaks. For example, the output of EXPLAIN can help you decide where to add indexes and quickly remedy slow queries by reporting the join type, the possible indexes versus the index actually chosen, the estimated number of rows to be examined, and so on.

How do you use EXPLAIN?

Simply put the keyword EXPLAIN in front of the query to be analyzed. EXPLAIN can be used in front of a query beginning with...
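To make the idea of prefixing a query concrete, here is a minimal sketch using Python's built-in sqlite3 module, chosen only because it needs no database server. Note the swap: the tip above describes output such as join type and possible indexes, which is what MySQL-style EXPLAIN reports, whereas in SQLite the readable form is EXPLAIN QUERY PLAN (plain EXPLAIN there returns low-level bytecode). The table, column, and index names are hypothetical.

```python
import sqlite3

# In-memory database with a small, hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, owner TEXT, runtime REAL)")
conn.executemany(
    "INSERT INTO jobs (owner, runtime) VALUES (?, ?)",
    [("alice", 1.5), ("bob", 3.0), ("alice", 0.7)],
)

query = "SELECT * FROM jobs WHERE owner = 'alice'"


def plan(conn, query):
    """Return the human-readable plan steps for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    # The last column of each row holds the description of the step.
    return [row[-1] for row in rows]


before = plan(conn, query)  # no index yet: expect a full table scan

# Add an index on the filtered column, then look at the plan again.
conn.execute("CREATE INDEX idx_jobs_owner ON jobs (owner)")
after = plan(conn, query)   # the plan should now mention the index

print(before)
print(after)
```

Comparing the two plans shows the kind of decision the tip describes: the first plan scans the whole table, while the second searches it through the new index. The exact wording of the plan text varies between SQLite versions.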

# Grid Maintenance - October 2018

September 27, 2018

Dear Compute Grid User,

This is a reminder that the next monthly Compute Grid maintenance is scheduled for Monday, October 1st, from 8am until 12pm.

This maintenance is a bit of a departure from previous ones as we...