"Run all" should reset scope, similar to how detach+attach does.
This would make it easy to "validate" that the notebook works correctly and to clear the slate after some manual fussing around.
13 votes
Notebook ‘Clear’ menu has new options to clear notebook state and run all cells. See below for documentation:
You can print notebooks, and many browsers allow you to print to PDF.
customer suggestion
1 vote
This is part of Databricks REST API v1.1.
When reconfiguring clusters I exceeded the AWS instance limit and no error messages were returned. If AWS Exceptions could be returned and the reconfiguration terminated more quickly it would be appreciated.
I've noted from a response that this might already be in the works.
1 vote
This has now been improved. Such exceptions now appear if you hover over the exclamation mark on the clusters page.
Render more elegant Markdown, using the most popular implementations on the web as inspiration (e.g. GitHub).
1 vote
The markdown has been reimplemented and improved. The fonts have changed and improved. Furthermore, the rendering now fully happens on the client side. That is, you no longer need to be attached to a cluster for markdown changes/rendering to work.
Right now, you can collapse & expand only the entire commands. It would be great if you could collapse the code and the output tables and charts separately.
Additionally, a collapse all would be cool.
This way, you can easily switch the notebook between the "inner workings" and the "report" modes.
I believe Zeppelin does it like this.
2 votes
This is now possible. The dropdown menu for each cell contains Hide/Show Code & Hide/Show Results.
- Have history of clusters (name, size, etc) after termination. Ability to clone.
This is now available on the clusters page. The Spark UI link lets you access historical clusters.
I have a large CSV file with column headers in the first row. When I try to import it as a table all my columns are of type "STRING" and my first row contains headers.
It would be good to be able to specify that the first line in a text file contains column names.
2 votes
CSV files often use quotes to protect embedded commas and the like.
Please provide an option to do quote processing.
2 votes
This works for local files uploaded to Databricks Cloud (DBC). In Q1 2015 we will also add support for it for S3 import.
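For readers who want to check how a file behaves before importing it, the two requests above (header row as column names, quoted fields with embedded commas) can be sketched with Python's standard `csv` module. This is only an illustration of the parsing behaviour, not how the Databricks importer is implemented; the sample data and the `read_csv_with_header` helper are made up for this example.

```python
import csv
import io

# Hypothetical sample: a header row, a quoted field containing a comma,
# and a quoted field containing an escaped (doubled) quote character.
sample = 'name,city,amount\n"Smith, Jane",Boston,42\n"O""Brien",Dublin,7\n'


def read_csv_with_header(text):
    """Parse CSV text, using the first row as column names.

    Quoted fields are handled by the csv module, so embedded commas
    and doubled quotes do not split or corrupt a value.
    """
    reader = csv.DictReader(io.StringIO(text), quotechar='"')
    return list(reader)


rows = read_csv_with_header(sample)
for row in rows:
    print(row)
```

Note that every value still comes back as a string here; converting columns to numeric types is a separate step.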
Support importing data from external databases, similar to the way one could import S3 buckets. I would personally love support for MongoDB and other NoSQL stores. I'd also consider making it easy to fetch data from Freebase, etc.
4 votes
Have an icon (like an exclamation mark?) with the other cell icons. This would run the cell. I can't run individual cells if I use the iPad to browse.
1 vote
This has been implemented and is available on mobile devices. A “play” triangle icon should appear on every active cell. Clicking it runs the cell.
We need IDE support (or at least support for syntax highlighting / auto-completion / auto-indentation) for usability and ease of development / test.
19 votes
- It is a little hard to tell where you are in re-running the cells of a notebook. IPython gives you numbers when a cell re-runs, which is helpful.
- Cell feedback is confusing, as at times the progress bar reruns multiple times.
- Capability to comment entire cells with one click.
7 votes
Many of the Jupyter keyboard shortcuts have been added and are now available in Databricks. You can figure out what the shortcuts are by clicking on the keyboard icon on the context-bar of any notebook.
For companies using Google Apps as their central user management, it would be cool if there was no need to create/maintain/delete separate accounts in Databricks Cloud.
9 votes
We now support SSO through SAML 2.0.
Allow users to set export limits and remove the existing default limit for CSV exports.
9 votes
The default limits have been improved. We are also reviewing making the limits configurable.
To be able to manage files on the cluster, including delete and download.
7 votes
Provide fine-grained access control (e.g., sharing of individual notebooks, edit and view-only capabilities, sharing with others outside of an organization).
11 votes
Administrators can now enable access control by clicking “Accounts” and then opening the security tab, where there is an “Enable Access Control” button.
Provide version control for notebooks that allows for easy viewing and reversion to earlier versions.
20 votes
Version control is now available in the professional tier (Databricks 1.4.1). Please try it out and let us know what you think.
Sorry about the earlier typo. Versioning is indeed available.
Be able to install custom libraries (in addition to just Java and Python libraries).
6 votes
We now support running a script on every node of a cluster when it’s launched. That script can then run arbitrary shell commands, e.g. download and install arbitrary libraries. The script has to be located in DBFS (Databricks FileSystem). Please contact a field engineer at Databricks for help to set this up.