Product Feedback

  1. Small Cluster instances for reading files

    A big pain point of Spark / Databricks is reading millions of small files, which is unfortunately a common scenario. It is possible to read them by spinning up a lot of workers at the same time, but that is also quite expensive. It would be great to have a cluster type with very little RAM just for reading all those files. Afterwards, it is possible to coalesce them into bigger Parquet files for further processing with a larger cluster and fewer worker nodes.
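    The read-then-compact step described above can be sketched in PySpark. This is a minimal sketch, not the requester's actual pipeline: the paths, input format, and 128 MB target file size are illustrative assumptions.

    ```python
    def target_partitions(total_bytes: int, target_file_bytes: int = 128 * 1024 * 1024) -> int:
        """Number of output files needed so each is roughly target_file_bytes."""
        return max(1, -(-total_bytes // target_file_bytes))  # ceiling division


    def compact_small_files(input_glob: str, output_path: str, total_bytes: int) -> None:
        """Read many small files, then rewrite them as fewer, larger Parquet files.

        Requires a Spark runtime; the paths are hypothetical placeholders.
        """
        from pyspark.sql import SparkSession

        spark = SparkSession.builder.getOrCreate()
        df = spark.read.json(input_glob)  # e.g. "s3://my-bucket/small-files/*.json"
        n = target_partitions(total_bytes)
        # coalesce() narrows to n partitions without a full shuffle, so the
        # expensive many-file read is followed by a cheap repack into n files.
        df.coalesce(n).write.mode("overwrite").parquet(output_path)
    ```

    With ~50 GB of input and the default 128 MB target, `target_partitions` yields 400 output files.
    
    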

    2 votes  ·  0 comments  ·  Cluster management
  2. Allow for non-GPU workers with a GPU driver

    For single-node GPU (ML/TensorFlow/etc.) work, it can be useful to have Spark running and aggregating work on the driver while not doing any GPU work on the workers. It would therefore be great to be able to mix and match a cost-effective setup that doesn't leave GPUs idling on the workers.

    1 vote  ·  0 comments  ·  Cluster management
  3. Support the new us-west-2d availability zone

    us-west-2d has consistently low spot pricing for some instance types. Please add support for it.

    4 votes  ·  0 comments  ·  Cluster management
  4. Support SSD for autoscaling (AWS) local storage

    Currently, autoscaling local storage only supports Throughput Optimized HDD. We use SSDs for local storage because of the greatly increased job performance, but this leads to a lot of over-provisioning and unused SSD capacity. It would be great to be able to use autoscaling local storage with SSDs.

    1 vote  ·  0 comments  ·  Cluster management
  5. Turn auto-termination on/off without cluster restart

    We want the cluster to be always on (auto-termination disabled) between 9 AM and 5 PM. Outside that window, the cluster should be on demand (auto-termination enabled). This could be realized either via an API call or as a schedule defined for the cluster. At the moment, changing the auto-termination setting requires a cluster restart (thus killing all jobs running on the cluster).
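    The requested schedule could look like the sketch below. The endpoint and payload this builds are hypothetical assumptions illustrating the proposed behavior, not an existing Databricks API; today, applying such a change restarts the cluster.

    ```python
    from datetime import time

    # Business-hours window from the request above.
    BUSINESS_START, BUSINESS_END = time(9, 0), time(17, 0)


    def desired_autotermination(now: time, idle_minutes: int = 60) -> int:
        """Auto-termination value wanted at a given time of day.

        0 means "never terminate" (always on); a positive value is the idle
        timeout in minutes. The 60-minute default is an illustrative choice.
        """
        return 0 if BUSINESS_START <= now < BUSINESS_END else idle_minutes


    def toggle_payload(cluster_id: str, now: time) -> dict:
        """Body for a hypothetical 'update auto-termination only' API call."""
        return {
            "cluster_id": cluster_id,
            "autotermination_minutes": desired_autotermination(now),
        }
    ```

    A scheduler would call this twice a day (at 9 AM and 5 PM) and POST the payload, without restarting the cluster.
    
    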

    3 votes  ·  0 comments  ·  Cluster management
  6. Support GPU clusters (g3) in North California region

    When we try to create a cluster with GPU instances, the only GPU instance types offered are p2 and p3, which are not available in our current installation region (North California).

    For the us-west-1 region, where our account runs, the only GPU instance types supported by AWS are the g3 instance types.

    1 vote  ·  0 comments  ·  Cluster management
  7. Azure - Make NCv3 available in North Europe

    Currently, only NCv1 instances (K80 GPUs) are available for GPU-accelerated ML workflows; these are excruciatingly slow and small compared to the NCv3 instances that run V100s.
    We have all our data in North Europe and are therefore stuck with K80s, putting us at a major competitive disadvantage.
    Furthermore, a lot of frameworks have cumbersome multi-GPU training workflows, so it is preferable to use one larger GPU over several smaller ones.

    3 votes  ·  0 comments  ·  Cluster management
  8. Live updates to Spark UI

    Right now I have to refresh the page to see updated data; it would be great if it updated live.

    1 vote  ·  0 comments  ·  Cluster management
  9. How to start cluster in trial account?

    A trial account only allows 4 cores, yet with the new machines the driver takes 4 cores and a single worker takes another 4.
    Any ideas on how to start a cluster on a trial account?

    16 votes  ·  0 comments  ·  Cluster management
  10. Azure Databricks mount points mounted on cluster level

    As far as I understand, this is already an option on AWS. I would like to manage mount-point access at the cluster level instead of at the workspace level. This would help me secure the data inside the mount points.

    10 votes  ·  0 comments  ·  Cluster management
  11. Save a metrics snapshot at job end/termination

    Take a final snapshot of the Ganglia cluster metrics.

    It looks like a snapshot of the Ganglia metrics is recorded every 15 minutes, so for jobs that run in fewer than 15 minutes there are no metrics snapshots to view. If possible, could a snapshot of the metrics be taken at job end?

    6 votes  ·  1 comment  ·  Cluster management
  12. Databricks sends a failure email only when all retries fail

    At the moment, Databricks sends a failure email every time a job fails.

    It would be better to have an option to send an email only when all retries of a job have failed.

    The reason for this feature is that no action can be taken when a single run fails. The only action is to go to the job dashboard and see that Databricks did well by relaunching the job. However, if the failure email arrived only when all retries had failed, it would actually signal that manual action on our part is needed.

    In our case, we're observing very regularly…

    42 votes  ·  1 comment  ·  Cluster management
  13. 6 votes  ·  0 comments  ·  Cluster management
  14. IAM role-based S3 access documentation should mention details for KMS-encrypted buckets

    The documentation on IAM role configuration for Databricks should state that the role you create (which provides access to S3 buckets) must be added as a Key User of any KMS keys used to encrypt those buckets.

    Otherwise, clusters receive an AccessDenied 403 on GetObject regardless of the IAM role's permissions. That error is very confusing in this case, as it does not indicate that the role needs to be added as a Key User for every KMS key used to encrypt the bucket.
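    The missing documentation step amounts to adding a statement like the one below to the KMS key's policy. This is a minimal sketch: the role ARN is a placeholder, and the action list mirrors the standard KMS "key user" permissions, which you should trim to what your workload actually needs.

    ```python
    import json


    def key_user_statement(role_arn: str) -> dict:
        """A KMS key-policy statement granting a role standard key-user permissions.

        The role ARN is supplied by the caller; everything else follows the
        usual shape of a KMS key-policy statement.
        """
        return {
            "Sid": "AllowDatabricksRoleToUseKey",
            "Effect": "Allow",
            "Principal": {"AWS": role_arn},
            "Action": [
                "kms:Encrypt",
                "kms:Decrypt",
                "kms:ReEncrypt*",
                "kms:GenerateDataKey*",
                "kms:DescribeKey",
            ],
            "Resource": "*",  # in a key policy, "*" means this key itself
        }


    # Placeholder ARN for illustration only.
    statement = key_user_statement("arn:aws:iam::123456789012:role/databricks-s3-access")
    policy_json = json.dumps(statement, indent=2)
    ```

    Without such a statement on the key, GetObject on objects encrypted with it fails with AccessDenied even when the IAM role's own S3 permissions are correct.
    
    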

    8 votes  ·  0 comments  ·  Cluster management
  15. Support for the *5 AWS instance families

    AWS released the *5 EC2 instance families a while ago, and their performance/cost ratio is considerably better than that of the *4 families. When does Databricks intend to support these new EC2 families?

    1 vote  ·  0 comments  ·  Cluster management
  16. Spin up multiple clusters running the same notebook with different arguments

    This would be extremely useful when tuning hyperparameters in a grid search for a machine learning pipeline, e.g. for ALS matrix factorisation. Azure HDInsight has this option, but it is a bit convoluted to set up; I'm sure Databricks could do it in an easier, more user-friendly manner. This is essentially the ability to launch several jobs at the same time, running on separate clusters, which automatically spin down when each job finishes and output their results somewhere.
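    The fan-out described above can be sketched as one run payload per grid combination. The submission endpoint itself is out of scope here; only the grid expansion and per-run payloads are shown, and the parameter names (`rank`, `reg_param`) are illustrative ALS-style assumptions.

    ```python
    from itertools import product


    def grid_payloads(job_id: int, grid: dict) -> list:
        """One run payload per combination of the given hyperparameter grid.

        Each payload carries the parameters one notebook run would receive;
        a launcher would submit each payload as a separate run on its own
        cluster.
        """
        keys = sorted(grid)  # fixed ordering so combinations are deterministic
        return [
            {"job_id": job_id, "notebook_params": dict(zip(keys, combo))}
            for combo in product(*(grid[k] for k in keys))
        ]


    # Hypothetical job id and ALS-style grid: 2 ranks x 3 regularisation values.
    payloads = grid_payloads(42, {"rank": [10, 20], "reg_param": [0.01, 0.1, 1.0]})
    ```

    Here the 2x3 grid yields six payloads, i.e. six independent runs that could each spin up, compute, write results, and terminate.
    
    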

    12 votes  ·  0 comments  ·  Cluster management
  17. 1 vote  ·  0 comments  ·  Cluster management
  18. Stop the cluster at the end of a big computation

    I propose a checkbox on the "run all" button to stop the cluster at the end.

    • The checkbox could also be set after the run has been started.

    1 vote  ·  0 comments  ·  Cluster management
  19. 1 vote  ·  0 comments  ·  Cluster management
  20. Jobs could be assigned to run on the default cluster.

    We have a set of jobs that we've assigned to run on the default cluster. If the default cluster gets terminated, those jobs all have to be manually reassigned to the new default cluster. It would be great to be able to set jobs to just use the default cluster, whatever it is.

    2 votes  ·  0 comments  ·  Cluster management