Product Feedback

  1. Add Multi-Factor Authentication (MFA) to login

    Adding Multi-Factor Authentication (MFA) to the login would be nice.

    20 votes
    4 comments  ·  Other

    This is now available through most identity providers, which you can connect to using SAML 2.0; this is how most Databricks customers do MFA. Please let us know if you would still like MFA outside of SSO/SAML connectivity.

  2. Cluster Activity Log

    I'd like to be able to see when clusters were shut down or brought up and by whom.

    7 votes
    1 comment  ·  Cluster management
  3. Notebooks stored in a git repo I could access from outside the cluster

    My preferred method of offline editing would involve cloning a git repo, and pushing my changes when I got back online.

    13 votes
    3 comments  ·  Notebooks
  4. Show which notebooks are running specific commands

    It would be nice to find which notebook ran a specific job on a cluster.

    22 votes
    0 comments  ·  Cluster management

    It is now possible to see on the detailed clusters page which notebooks are actively running vs. idle. It is also possible to go into a notebook, click Schedule, and see the list of jobs that use that notebook. If you have additional use cases you’d like us to cover with respect to insight into what’s running, please contact us.

  5. Easy way to see all attached notebooks and detach some of them

    We had a lot of notebooks attached to our default cluster. Detaching them was tedious. A list that eliminated all the navigation would have been nice.

    3 votes
    0 comments  ·  Notebooks
  6. Summary Statistics for MLlib Models

    It would be great to see summary statistics for MLlib models, specifically around the significance of features (e.g., the lm summary table with p-values in R, or feature importance in Random Forest). It would also be nice to have one-click plotting for each of the features, so you could see what each individual feature looks like versus the dependent variable (to assess transformations or other manipulations, or drop features from the model).

    I know some of this doesn't even exist for Spark MLlib, so I guess it's a request for both.

    4 votes
    0 comments  ·  Notebooks

    We now have support for SparkR notebooks in Databricks 1.4.1. That should give you access to summary tables and other statistics. You don’t need to translate all your notebooks to R; just save the tables you want statistics on and use an R notebook to get statistics on those tables using SQL. Please try it out and let us know what you think.
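
    The admin response suggests an R notebook; for readers staying in Python, a rough equivalent is sketched below (the table and column names are hypothetical, and statsmodels is assumed to be installed on the driver):

        # Hypothetical sketch: lm-style summary (coefficients, p-values) over a Spark table.
        # Table name "features" and columns label, f1, f2 are made up for illustration.
        import statsmodels.formula.api as smf

        pdf = sqlContext.sql("SELECT label, f1, f2 FROM features").toPandas()
        fit = smf.ols("label ~ f1 + f2", data=pdf).fit()
        print(fit.summary())  # coefficient table with p-values, R-squared, etc.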

  7. UI notification when spot instance prices are high

    Lately I've been wasting lots of time trying to request a spot instance cluster: on days when the spot instance price is high, I have to wait a long time only to see that the request has failed.

    Even though there are ways for me to check the spot instance prices myself, I suspect it would help many of your customers like me to have a simple UI notification when the spot instance price is high and it's recommended to request an on-demand cluster instead - followed by another notification when the spot instance prices…

    3 votes
    completed  ·  1 comment  ·  Navigation UI
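
    For reference, the kind of spot-price check mentioned above can be scripted against the EC2 API; a minimal boto3 sketch (the instance type, region, and threshold are placeholders):

        # Sketch: warn when the current spot price for an instance type exceeds
        # a chosen threshold. Instance type, region, and threshold are placeholders.
        import boto3

        ec2 = boto3.client("ec2", region_name="us-west-2")
        resp = ec2.describe_spot_price_history(
            InstanceTypes=["r3.2xlarge"],
            ProductDescriptions=["Linux/UNIX"],
            MaxResults=1,
        )
        price = float(resp["SpotPriceHistory"][0]["SpotPrice"])
        if price > 0.35:  # arbitrary threshold
            print("Spot price %.3f is high; consider an on-demand cluster." % price)
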
  8. 1 vote
    0 comments  ·  Visualizations
  9. 8 votes
    1 comment  ·  Navigation UI

    Happy to announce that our third most requested feature has been released. You can now browse all databases in the UI, as well as all the tables inside those databases. Thus, you can organize your tables into different databases.
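
    The same organization can be done from a notebook with Spark SQL (the database and table names below are made up; assumes a Hive-enabled sqlContext, as in Databricks):

        # Example with made-up names: put tables into their own database.
        sqlContext.sql("CREATE DATABASE IF NOT EXISTS analytics")
        sqlContext.sql("CREATE TABLE analytics.daily_events AS SELECT * FROM raw_events")
        sqlContext.sql("SHOW TABLES IN analytics").show()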

  10. 1 vote
    completed  ·  1 comment
  11. Add scroll to the attach-notebook list in Chrome

    The list shown when attaching a notebook in Chrome has no scrollbar; if the list is long, you currently need to use Ctrl+F to find anything.

    1 vote
    0 comments  ·  Other
  12. Window functions in Spark SQL

    Add windowing functions to Spark SQL such as lead, lag, first_value and last_value.

    11 votes
    1 comment  ·  Other

    This was just completed for Spark 1.4. This version of Spark is still being QA'd, but we’re hoping to have an early preview of it in Databricks Cloud before the official 1.4 release.
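
    For reference, here is roughly what the new window functions look like from a Python notebook on Spark 1.4+ (the DataFrame and column names are made up):

        # Example with made-up column names: lead/lag over a window (Spark >= 1.4).
        from pyspark.sql import functions as F
        from pyspark.sql.window import Window

        w = Window.partitionBy("user_id").orderBy("event_time")
        df.select(                           # df is an existing DataFrame
            "user_id",
            "event_time",
            F.lag("amount", 1).over(w).alias("prev_amount"),
            F.lead("amount", 1).over(w).alias("next_amount"),
        ).show()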

  13. Write to S3 using Server Side Encryption

    We need to store some data in an encrypted form. We normally use S3's SSE. It seems that we cannot write from Spark to S3 using SSE. Supporting that would be helpful.

    9 votes
    0 comments  ·  Data import / export

    This feature is now finished. We support SSE-S3 and SSE-KMS. This is available in Databricks 2.2, which is rolling out this week or next. You can mount directories using dbutils.fs.mount and specify the desired SSE-based encryption method.
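
    Outside the mount-based route, plain Spark can also request SSE-S3 on write through the Hadoop S3A configuration; a sketch (the bucket and path are placeholders, and SSE-KMS takes different settings):

        # Sketch: ask S3A to apply SSE-S3 (AES256) to the objects Spark writes.
        # The bucket and path are placeholders; for the mount-based route,
        # see dbutils.fs.help() for the exact options.
        sc._jsc.hadoopConfiguration().set(
            "fs.s3a.server-side-encryption-algorithm", "AES256")

        df.write.parquet("s3a://my-bucket/encrypted/output")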

  14. Clusters could be created programmatically

    To create a fully automatic data pipeline where Spark is just one step, it is necessary to spin up a cluster, run some Spark job (i.e., a notebook), and terminate the cluster via an API. The web UI is quite limiting.

    5 votes
    completed  ·  1 comment  ·  Cluster management
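
    This maps to the Databricks REST API; a minimal sketch against the clusters/create endpoint (the instance URL, token, and sizing values are placeholders):

        # Minimal sketch: create a cluster via the REST API; placeholders throughout.
        import requests

        resp = requests.post(
            "https://<your-instance>.cloud.databricks.com/api/2.0/clusters/create",
            headers={"Authorization": "Bearer <personal-access-token>"},
            json={
                "cluster_name": "pipeline-step",
                "spark_version": "<supported-runtime-version>",
                "node_type_id": "<node-type>",
                "num_workers": 4,
            },
        )
        cluster_id = resp.json()["cluster_id"]  # pass to clusters/delete when done
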
  15. Cluster configuration templates

    We use a handful of different cluster configurations (e.g., 250 GB on-demand, 100 GB on-demand, 1 TB spot, etc.) to run different operations. In the clusters tab there would be two areas: live clusters and cluster configurations. If I want to spin up a specific cluster that I've already configured, I can just click a create button for the configuration and it will spin up in the live-cluster area.

    This feature will become more useful with job scheduling so I can map each scheduled job to a cluster configuration so the correct size cluster is…

    9 votes
    4 comments  ·  Cluster management

    You can now restart terminated clusters by clicking the “play” button on the clusters page. That way, you keep all your previous parameters and can use previous cluster settings as templates. You can also clone a terminated cluster by clicking the copy button on the clusters page.

  16. A way to programmatically control the plot that comes out of the Databricks interface

    For example, if I want a different graph for each breakdown category, I have to do the query analysis and then set up each plot by manually configuring its fields. A great thing about matplotlib is that you can programmatically determine what type of graph you want and replicate it as necessary in code.

    2 votes
    1 comment  ·  Notebooks
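
    As a workaround today, matplotlib can already drive this from a Python cell; a sketch with made-up table and column names (assumes the query result fits on the driver and that display() accepts matplotlib figures):

        # Sketch: one scatter plot per breakdown category, driven entirely by code.
        import matplotlib.pyplot as plt

        pdf = sqlContext.sql("SELECT category, x, y FROM results").toPandas()
        for category, group in pdf.groupby("category"):
            fig, ax = plt.subplots()
            ax.scatter(group["x"], group["y"])
            ax.set_title(category)
            display(fig)  # render the figure in the notebook
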
  17. Add a cell in Notebook

    How do you add a cell in a notebook (somewhere in the middle)?
    There's a delete button. Where's the add button?

    1 vote
    1 comment  ·  Notebooks

    Just move the mouse cursor in between two cells and a “+” symbol should appear; click it and a new cell will be inserted between those two existing cells.

  18. Stop polluting our root

    With every deploy, a new databricks_guide arrives in the root, and now also learning_spark_book. Could you instead look up where the previous folder was and replace it there, so we can move it elsewhere and not have old versions lying around?

    1 vote
    0 comments

    Jaka, thanks for this suggestion. Those folders are now special folders with their own icon and are treated separately by the system. They should no longer pollute your root. Thanks.

  19. Create Table: Add ability to preview varying data formats - JSON etc

    Expand the number of file types that can be imported via the web UI when creating a table, e.g., JSON. Currently it appears to support delimited text files only.

    1 vote
    0 comments  ·  Data import / export
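
    In the meantime, this can be done from a notebook instead of the web UI; a sketch with a placeholder path (Spark 1.4+ reader API):

        # Sketch: create a queryable table from JSON without the web UI.
        # The S3 path and table name are placeholders.
        df = sqlContext.read.json("s3a://my-bucket/events/*.json")
        df.printSchema()                     # preview the inferred schema
        df.registerTempTable("events_json")  # or df.write.saveAsTable(...) for a permanent table
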
  20. "Run all" should reset scope

    "Run all" should reset scope, similar to how detach+attach does.

    This would make it easy to "validate" the notebook works correctly and to clear the slate after some manual fussing around.

    13 votes
    2 comments  ·  Dashboards