"Recent History" for clusters would show system activity (like cluster terminated due to spot price being too high) along with user activity
"Recent History" for clusters would show system activity (like cluster terminated due to spot price being too high) along with user activity.1 vote
We added the cluster activities inside cluster event tab (https://docs.databricks.com/user-guide/clusters/event-log.html#event-log) and plan to deprecate the Recent History for clusters.
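As a rough illustration (the endpoint path and field names below are assumed from the Databricks REST API; the host, token, and cluster ID are placeholders, not values from this thread), cluster events can also be fetched programmatically:

```python
import json
import urllib.request


def build_events_request(cluster_id, event_types=None, limit=50):
    """Build the JSON payload for a clusters/events request."""
    payload = {"cluster_id": cluster_id, "limit": limit}
    if event_types:
        payload["event_types"] = event_types
    return payload


def fetch_cluster_events(host, token, cluster_id):
    """POST to the (assumed) clusters/events endpoint and return the events list."""
    payload = build_events_request(cluster_id, event_types=["TERMINATING", "RESIZING"])
    req = urllib.request.Request(
        f"https://{host}/api/2.0/clusters/events",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["events"]
```

The same event types (terminations, resizes, and so on) are what the event log tab surfaces in the UI.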
Let us know if you have any concerns.
Selecting notebooks shouldn't automatically hide the file structure (makes scrolling through notebooks impossible)
Right now, scrolling through notebooks is impossible since the file structure is automatically hidden every time you select a notebook. Can we at least have this as an option in customizable UI settings? This really makes me steer away from using the environment overall. (1 vote)
You can pin the file browser. There is a small pin icon at the bottom right of the left menu. Once it's pinned, you can browse different notebooks and they'll preview without closing the file browser. Please let us know if this helps.
Chapter 2 notebook should have the right path: "val lines = sc.textFile("file:///dbfs/learning-spark-master/README.md") // Create an RDD ..."
Learning Spark Chapter 2 Examples in Scala
Example 2-2. Scala line count
should have right path
val lines = sc.textFile("file:///dbfs/learning-spark-master/README.md") // Create an RDD called lines
Because this was executed from Initial Setup in the Overview notebook when the following was evaluated:
os.system("cd /dbfs; rm master.zip; rm -rf learning-spark-master; wget https://github.com/databricks/learning-spark/archive/master.zip; unzip master.zip;ls")1 vote
Thank you. This will be fixed in the next release of Databricks.
Without a date/time stamp it's hard to tell when an idea turned into a planned item or when a response to feedback was submitted. (1 vote)
Unfortunately this is a limitation of the UserVoice portal software. Note, however, that you can get the list of the newest suggestions:
GB*hours by user by month. (1 vote)
accounts.cloud.databricks.com now shows the variable usage (GB*hours) for every month.
If you enabled the GitHub integration option at the basic account level of service, it would help us feel safe investing our time and energy into developing our solution on Databricks. As we grow we will move up to the pro-level account, but right now we are unclear whether our work is backed up and safe. This should not be a cost to you at all, so please provide it. Thanks. (3 votes)
All databases run distributed with a replica in another datacenter. Furthermore, we do database dumps every day and upload them to S3, which has eleven nines of durability.
DBC has a login screen that can be accessed over non-SSL. It first redirects you to SSL, but once you have the JSESSIONID, you can request the non-SSL page.
This sets off warnings and alerts when our security team audits our publicly reachable servers.
The request is to please always restrict the login screen to SSL. (3 votes)
This is now going out in Databricks 2.2, which is being released this coming week.
Add a Search text field to find threads in the Forum. (1 vote)
This is now possible. Inside Databricks, click on the question mark at the top right and then enter your search term where it says "Search Guide & Forum". This will search all the forum posts and give you links to the posts.
We have a lot of libraries loaded and they fill the workspace pull-right. Also, the same library name can appear multiple times. It would be great if there were some more logical and informative display of the libraries. (1 vote)
The best practice we see from other customers is to organize the libraries into sub-folders. Otherwise it looks as if the workspace is “polluted” with a lot of libraries. We are simultaneously looking at improving our library management.
The icon is already the same, now just make the functionality match as well :) (1 vote)
This is now completed and the Home button goes to the user’s home folder.
Pass arguments from the jobs to a notebook (5 votes)
It is possible to use the REST API to pass arguments to both notebook and JAR jobs.
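For instance (a sketch assuming a jobs/run-now endpoint accepting notebook_params, and a personal access token; the host, token, job ID, and parameter names below are placeholders), arguments can be passed to a notebook job like this:

```python
import json
import urllib.request


def build_run_now_payload(job_id, notebook_params):
    """Payload for triggering a job run with notebook arguments."""
    return {"job_id": job_id, "notebook_params": notebook_params}


def run_job(host, token, job_id, notebook_params):
    """Trigger the job over the (assumed) jobs/run-now endpoint.

    Inside the notebook, the values would typically be read back with
    dbutils.widgets.get("<param name>").
    """
    payload = build_run_now_payload(job_id, notebook_params)
    req = urllib.request.Request(
        f"https://{host}/api/2.0/jobs/run-now",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["run_id"]
```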
Choose the HDFS/Hadoop version for the Spark cluster. (25 votes)
This is now available as an option when launching clusters in Databricks 2.15.
There doesn't seem to be a way to get the current date in SQL. now() returns a null pointer exception. (3 votes)
Please use unix_timestamp() instead. We’ll look into now().
Sometimes it just takes ages to get spot clusters, so we give up and create on-demand ones. In that case we need to remember to kill the spot cluster once it's up. (1 vote)
Thanks Jaka. Cancellation of create cluster and resize cluster requests are now released as part of Databricks 2.1.
It would be great if standard IPython notebooks (as opposed to the dbc format) could be imported into DBC. Why change the standard ipynb files? That would allow users to test their ideas on any platform and then import into DBC. (1 vote)
This feature is completed. You can now import and export to/from the IPython notebook format.
Importing IPython notebooks using drag and drop is confusing (move/copy etc.). Sometimes after dropping it just shows the markup with no error. Please improve this. (1 vote)
You should be able to copy/paste cells in notebooks without drag ‘n drop. That’s especially useful for either (a) big cells with a lot of text in them or (b) cells which you want to move long distances.
In "Learning Spark Chapter 11, Machine Learning with MLlib, Examples in Scala", I can not get access to the file "spam.txt",and got an "org.apache.hadoop.mapred.InvalidInputException" error1 vote
This has been fixed. Thanks.
The idea is that when we create a cluster used only for data/model exploration, we can select an option to ensure the cluster gets shut down outside business hours/days and relaunched in time for the next day of work.
Of course we can do this manually, but 99% of the time we would forget. (47 votes)
Happy to announce that the most-voted-for feature has been released. You can now set up auto-termination on any cluster that is launched. The timeout for shutdown is configurable.
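As a sketch (the clusters/create endpoint and the autotermination_minutes field are assumed from the Databricks REST API; the host, token, Spark version, and instance type below are placeholders), the same auto-termination timeout can be set when creating a cluster programmatically:

```python
import json
import urllib.request


def build_cluster_spec(name, minutes_idle=60):
    """Cluster spec that auto-terminates after `minutes_idle` minutes of inactivity."""
    return {
        "cluster_name": name,
        "spark_version": "7.3.x-scala2.12",  # placeholder version string
        "node_type_id": "i3.xlarge",         # placeholder instance type
        "num_workers": 2,
        "autotermination_minutes": minutes_idle,
    }


def create_cluster(host, token, spec):
    """POST the spec to the (assumed) clusters/create endpoint."""
    req = urllib.request.Request(
        f"https://{host}/api/2.0/clusters/create",
        data=json.dumps(spec).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["cluster_id"]
```

Setting the timeout in the spec means an exploration cluster shuts itself down after hours without anyone having to remember.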
In the Jobs page, it would be good to have a new column for the status of the most recent job. (1 vote)
This is an excellent suggestion and we have now implemented it. It will be released in about three weeks after QA is done. Thanks Mohan.
It would be great if I could choose to create a Spark cluster on instance types that are optimized for CPU. (7 votes)