While InscriptiFact includes artifact images dating to the early 1900s (the artifacts themselves are often thousands of years old), some of the most amazing images are relatively new RTI images. The best way to understand how RTI images are created is to look at the Melzian Dome used to capture them.
The dome has 32 computer-controlled LED lights. Multiple exposures of the same artifact are taken using different lighting combinations and then merged into a single image file. Using the InscriptiFact viewer, a Java application that can run on any PC or laptop, a user can dynamically change the lighting on the image being viewed. Seeing is believing, so let's take a look at an example.
InscriptiFact provides the ability to compare conventional images alongside RTI images. Illustrated above is an Aramaic tablet from Persepolis, ancient Persia, with a seal impression. The images on the left are visible light and infrared images taken with a high-end digital scanning back. The images on the right are versions of an RTI image, one showing the natural color of the object, the other using specular enhancement. Even to the untrained eye, the power of RTI to bring often better-than-lifelike detail to ancient artifacts is clear.
While the RTI images are visually the most powerful aspect of InscriptiFact, the real value of the system goes much further, thanks to the InscriptiFact user interface and the underlying Oracle Database. Take, for instance, the spatial search feature. This feature allows researchers to drag a box on a reference image and retrieve all images that intersect the box.
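To make the idea of a box-intersection search concrete, here is a minimal Python sketch. It is only an illustration of the geometric test, not InscriptiFact's actual implementation: the `Box` class, the `search` function, and the image names and coordinates are all hypothetical, and a real system would more likely run this kind of query inside the database with a spatial index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in reference-image pixel coordinates (hypothetical)."""
    left: float
    top: float
    right: float
    bottom: float

    def intersects(self, other: "Box") -> bool:
        # Two rectangles overlap when they overlap on both the x and y axes.
        return (self.left < other.right and other.left < self.right
                and self.top < other.bottom and other.top < self.bottom)


def search(selection: Box, image_regions: dict) -> list:
    """Return the names of all images whose region intersects the dragged box."""
    return [name for name, region in image_regions.items()
            if region.intersects(selection)]


# Hypothetical detail images, each covering a region of the reference image.
image_regions = {
    "seal_impression_rti": Box(10, 10, 60, 40),
    "line_3_infrared": Box(50, 35, 120, 55),
    "reverse_edge": Box(200, 10, 260, 80),
}

# The box the researcher drags on the reference image.
hits = search(Box(40, 30, 70, 50), image_regions)
print(hits)
```

A linear scan like this is fine for a handful of images; for a large archive, the same overlap test would normally be backed by an index so that only nearby regions are checked.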
InscriptiFact is designed to incorporate, integrate, and index all existing image data in a quick and intuitive fashion, regardless of which repository or collection the artifact (or fragments thereof) resides in. In the example below, the original tablet on which an ancient myth was written was broken, and pieces ended up in two different museums. Using InscriptiFact, a researcher can easily retrieve images of all the pieces for viewing on a single screen.
Not only is InscriptiFact a powerful tool in its own right for anyone from post-grad archeologists to grade school students, it's a wonderful example of what is possible through the integration of advanced imaging, advanced database and Java technology, and the Internet to span both space and time. Visit the InscriptiFact web site to learn more.
To put things in perspective, the new Cluster Compute Instances should be compared to other AWS instance types. According to Amazon, a standard AWS EC2 compute unit is normalized to "the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor". The new Cluster Compute Instance is equivalent to 33.5 EC2 compute units. On the surface, that isn't much more powerful than the previous 26 EC2 compute unit High-Memory Quadruple Extra Large Instance (although the name is certainly simpler). What is different is the Cluster Compute Instance architecture. You can cluster up to 8 Cluster Compute Instances for 64 cores or 268 EC2 compute units. With the Cluster Compute Instance, Amazon provides additional details on the physical implementation, calling out "2 x Intel Xeon X5570, quad-core Nehalem architecture" processors per instance. Perhaps more importantly, while other AWS instance types only specify IO capability as "moderate" or "high", the Cluster Compute Instance comes with "full bisection 10 Gbps bandwidth between instances". While there is a certain value in consistently advertising compute instances in standard EC2 compute units and IO bandwidth as moderate or high, I applaud Amazon on their increased transparency in calling out both the specific Intel X5570 CPU and the specific 10GbE IO bandwidth of the new Cluster Compute Instances.
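The arithmetic behind those figures is easy to check. The short Python sketch below simply recomputes the totals from the per-instance numbers quoted above; the constant and function names are my own, chosen for illustration.

```python
# Per-instance figures quoted in the text above.
ECU_PER_CLUSTER_INSTANCE = 33.5   # EC2 compute units for the new instance type
ECU_PER_HIGH_MEM_4XL = 26         # High-Memory Quadruple Extra Large Instance
SOCKETS_PER_INSTANCE = 2          # "2 x Intel Xeon X5570"
CORES_PER_SOCKET = 4              # quad-core
MAX_INSTANCES_PER_CLUSTER = 8


def cluster_totals(instances: int) -> tuple:
    """Return (total cores, total EC2 compute units) for a cluster of this size."""
    cores = instances * SOCKETS_PER_INSTANCE * CORES_PER_SOCKET
    ecus = instances * ECU_PER_CLUSTER_INSTANCE
    return cores, ecus


cores, ecus = cluster_totals(MAX_INSTANCES_PER_CLUSTER)
single_instance_ratio = ECU_PER_CLUSTER_INSTANCE / ECU_PER_HIGH_MEM_4XL

print(f"{cores} cores, {ecus} EC2 compute units in a full cluster")
print(f"one new instance is about {single_instance_ratio:.2f}x the older 26-unit instance")
```

The single-instance ratio of roughly 1.3x is why the per-instance comparison looks modest; the difference comes from clustering and the interconnect rather than from raw compute units.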
So what about Oracle Grid Engine makes it so useful for the new Cluster Compute Instances? AWS already offers customers a broad range of Oracle software on EC2, ranging from Oracle Enterprise Linux to Oracle Database and Oracle WebLogic Server, and you can download pre-built AWS instances directly from Oracle. Don't take my word for it; read about what joint Oracle/AWS customers like Harvard Medical School are doing with Oracle software on AWS. But back to Oracle Grid Engine. Oracle Grid Engine software is a distributed resource management (DRM) system that manages the distribution of users' workloads to available compute resources. Some of the world's largest supercomputers, like the Sun Constellation System at the Texas Advanced Computing Center, use Oracle Grid Engine to schedule jobs across more than 60,000 processing cores. You can now use the same software to schedule jobs across a 64-core cluster of AWS Cluster Compute Instances.
Of course, many customers won't use only AWS or only their own compute cluster. A natural evolution of grid to cloud computing is so-called Hybrid Clouds that combine resources across public and private clouds. Oracle Grid Engine already handles that too, enabling you to automatically provision additional resources from the Amazon EC2 service to process peak application workloads, reducing the need to provision datacenter capacity according to peak demand. This so-called cloud bursting feature of Oracle Grid Engine is not new; it's just that you can now cloud burst onto much more powerful AWS Cluster Compute Instances.
One of Oracle's partners who has been doing a lot of work with Oracle Grid Engine in the cloud is Univa UD. I had the opportunity to speak to Univa's new CEO, Gary Tyreman, today about how they are helping customers build private and hybrid clouds using Oracle Grid Engine running on top of Oracle VM and Oracle Enterprise Linux. Gary told me Univa has been beta testing the AWS Cluster Compute Instance for several months and that it has worked flawlessly with Oracle Grid Engine and Oracle Enterprise Linux. Gary also noted that they are working with a number of Electronic Design Automation (EDA) customers that need even more powerful virtual servers than the ones available on AWS today. We have several joint customers that are evaluating the new Sun Fire x4800 running Oracle VM as supernodes for running EDA applications in private clouds. To put it in perspective, a single x4800 running Oracle VM can support up to 64 cores and 1 TB of memory. That is as many cores as, and many times the memory of, a full 8-node cluster of AWS Cluster Compute Instances in a single 5RU server! Now that is a powerful cloud computing platform.
If you want to hear more from Gary about what Univa is doing with some of their cloud computing customers, download his Executive Roundtable video. I'd love to hear from some additional customers who are using Oracle Grid Engine on the new AWS Cluster Compute Instances. Who knows, maybe in the future Amazon will even offer a Super Duper Quadruple Extra Large Cluster Compute Instance based on a single 64-core, 1 TB server like the Sun Fire x4800. Meanwhile, you can easily take advantage of both Cluster Compute Instances and the x4800 by building your own hybrid cloud with Oracle Grid Engine.
Tom Cramer, Chief Technology Strategist and Associate Director, Digital Library Systems and Services, Stanford University, started off the morning. One of the interesting points Tom made was how Stanford seamlessly pulls data from five digital systems in the process of archiving student thesis papers. Starting with student and professor information from Stanford's Oracle PeopleSoft campus information system, archive metadata is automatically populated and combined with thesis PDFs, a new library catalog data record is automatically created, and finally, PDFs and associated metadata are automatically crawled and published to the world via Google Books.
Next, Oxford's Neil Jefferies took the discussion a bit deeper and talked about the changing nature of intellectual discourse. While Oxford's collection fills over 250 km of shelving with paper books, the library is increasingly working to archive more ephemeral university data sources including websites, social media, and linked data. A consistent theme discussed by Neil and many of the other speakers was the increasing focus on providing not only archiving and preservation but also access to data.
Moving on to the continent, Laurent DuPlouy and Olivier Rouchon from the French National Library presented on the SPAR Project and CINES Collaboration. They were brave enough to show a live demo of their system, including use of a StorageTek SL8500 Modular Library System.
Back to the UK, Brian Hole from The British Library presented on the LIFE3 project, which aims to model the long-term preservation lifecycle costs of digitized data. Brian is taking suggestions for improvements in LIFE4, and I suggested he include in his model the Oracle Secure Backup Cloud module, which can securely back up databases to Amazon S3 cloud storage.
After a wonderful Spanish lunch, the first panel session of the day started with discussions on Community and Tool Set collaborations.
DuraSpace CEO Sandy Payette presented on the Platform as a Service (PaaS) offering DuraCloud.
Richard Jones presented on the SWORD project on repository interoperability. Read and comment on the SWORD whitepaper.
Jan Reichelt, founder and director of Mendeley, presented the Mendeley reference management software, which is used to organize, share, and discover academic research papers. Mendeley tracks over 28 million research papers, including information on the most read papers and authors.
David Tarrant of EPrints discussed how EPrints software is used to create and manage repositories.
Finally, Bram van der Werf of Open Planets Foundation described the Open Planets suite of tools for managing digital data.
After the panel presentation, we heard from a series of Oracle speakers. The Oracle Enterprise Content Management Suite 11g is broadly applicable to preservation and archive, capable of archiving over 179 million documents a day as shown in a recent benchmark. Of course, many PASIG customers already use the Sun Storage Archive Manager software along with StorageTek modular library systems, and there were updates from Oracle speakers on all of these products and more.
The final session included short presentations from a number of Oracle software partners in the archive and preservation space. I definitely learned a lot today about what some of the world's leading digital libraries are doing on the preservation and archive front, and hopefully it was a day well spent for all who attended. If you are not already a PASIG member, be sure to sign up now for this growing Oracle community.
While Oracle has many virtualization technologies, one discussed by both articles is Oracle VM. When considered in combination with the new Sun Fire x4800 server, Oracle VM is a great example of the benefits of Oracle's Software. Hardware. Complete. engineering philosophy. While there are alternative VM technologies in the market, Oracle VM is one of the few that can take full advantage of the capabilities of servers like the Sun Fire x4800 that are built using Intel's latest 7500 series x86 CPUs. Oracle VM can take full advantage of all 64 x86 cores in the Sun Fire x4800 as well as all 1 TB of memory. If you are not using Oracle VM as your virtualization platform, you might want to ask your VM vendor when they will support a full 1 TB of memory and 64 CPU cores like Oracle VM does today (full Oracle VM technical specs can be found here).
You can read all of this issue of Oracle Magazine online, but I also recommend that you sign up for your own complimentary subscription. The paper copy is highly recommended for poolside or beach lounging, as well as for those 10-minute periods during takeoffs and landings.
Starting with the entry-level Sun Storage 7110, Sun Unified Storage scales up to 576 TB of raw capacity with the newly upgraded Sun Storage 7410. However, unlike other storage offerings that deliver much less usable storage than their raw capacity, Oracle's unified storage offerings often deliver more usable storage than their raw capacity. Let's take a look at how that's done.
For starters, Oracle's unified storage products are all based on the ZFS file system, so you get ZFS's powerful data compression built in at no additional cost. ZFS data compression not only saves valuable storage space, it can actually speed up applications like the MySQL database. Listen to what Oracle customer Don MacAskill from online photo site SmugMug had to say about ZFS data compression and MySQL. Full disclosure: I'm a happy paying customer of SmugMug, storing about 20,000 pictures on the site.
Of course, Oracle's unified storage offers a lot more ways to save storage than simple data compression. While other storage vendors require you to purchase costly software upgrades, often from third-party firms, to enable data deduplication, all of Oracle's unified storage servers now offer deduplication built in. So if I upload 10 copies of the same picture to SmugMug, they only need to store it once (actually, SmugMug keeps four copies of every unique picture I upload, one of the best availability and preservation policies of any photo site). Or if I'm running 10 copies of the same Oracle VM virtual machine image, deduplication can save me from storing duplicate data.
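The principle behind deduplication can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not how the storage appliance is implemented: ZFS deduplicates at the block level, whereas this hypothetical `DedupStore` class hashes whole objects for brevity.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical data is kept only once (hypothetical)."""

    def __init__(self):
        self._blocks = {}  # SHA-256 digest -> stored bytes
        self._files = {}   # file name -> digest of its contents

    def put(self, name: str, data: bytes) -> None:
        digest = hashlib.sha256(data).hexdigest()
        # Store the payload only if this content has not been seen before.
        self._blocks.setdefault(digest, data)
        self._files[name] = digest

    def get(self, name: str) -> bytes:
        return self._blocks[self._files[name]]

    def logical_bytes(self) -> int:
        """Space the files would take without deduplication."""
        return sum(len(self._blocks[d]) for d in self._files.values())

    def physical_bytes(self) -> int:
        """Space actually consumed by unique content."""
        return sum(len(b) for b in self._blocks.values())


store = DedupStore()
picture = b"\x89PNG" + b"\x00" * 996  # stand-in for a 1000-byte picture
for i in range(10):
    store.put(f"copy_{i}.png", picture)

print(store.logical_bytes(), store.physical_bytes())
```

Ten uploads of the same picture are all readable under their own names, but the unique content is stored only once.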
While SmugMug doesn't put any quotas on how many photos I can upload and store, most enterprise environments enforce user quotas to ensure a single user doesn't use up more storage than expected. Quotas have been around for many years. If you have a 100 TB filesystem, you can allocate 100 users a 1 TB quota and ensure you never run out of space. However, since many users will never use even a fraction of their quota, quotas can actually waste space. Enter so-called "lightweight" quotas. A lightweight quota scheme only allocates space to a user when they require it, allowing you, for instance, to share a 100 TB filesystem with 200 users, each with a 1 TB quota. This of course requires some additional active management to move users to new filesystems as you approach capacity. However, even most so-called lightweight quota systems don't reclaim space when a user deletes files. So if you have 100 users store 1 TB each of data, then they each delete half a TB, the quota system will still show 100 TB allocated. Oracle's unified storage is one of the few systems to implement a truly lightweight quota system. If a user stores 1 TB of data, then deletes half of it, the freed 500 GB becomes available for other users.
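The quota scenario above can be modeled with a small Python sketch. The `ThinQuotaPool` class is a hypothetical illustration of the accounting, assuming space is charged only when written and returned to the pool on delete; it is not the appliance's actual quota code.

```python
class ThinQuotaPool:
    """Toy model of lightweight quotas on a shared filesystem (hypothetical)."""

    def __init__(self, capacity_tb: float):
        self.capacity_tb = capacity_tb
        self.quotas = {}  # user -> quota in TB
        self.used = {}    # user -> space currently used in TB

    def add_user(self, user: str, quota_tb: float) -> None:
        # Granting a quota reserves nothing; space is consumed only on write.
        self.quotas[user] = quota_tb
        self.used[user] = 0.0

    def total_used(self) -> float:
        return sum(self.used.values())

    def free_tb(self) -> float:
        return self.capacity_tb - self.total_used()

    def oversubscription(self) -> float:
        """Ratio of promised quota to real capacity."""
        return sum(self.quotas.values()) / self.capacity_tb

    def write(self, user: str, tb: float) -> None:
        if self.used[user] + tb > self.quotas[user]:
            raise ValueError(f"{user} would exceed quota")
        if self.total_used() + tb > self.capacity_tb:
            raise ValueError("filesystem is full")
        self.used[user] += tb

    def delete(self, user: str, tb: float) -> None:
        # Freed space goes straight back to the shared pool.
        self.used[user] = max(0.0, self.used[user] - tb)


pool = ThinQuotaPool(capacity_tb=100)
for n in range(200):
    pool.add_user(f"user{n}", quota_tb=1.0)

for n in range(100):
    pool.write(f"user{n}", 1.0)       # 100 users store 1 TB each
free_when_full = pool.free_tb()

for n in range(100):
    pool.delete(f"user{n}", 0.5)      # each deletes half a TB
free_after_deletes = pool.free_tb()

print(pool.oversubscription(), free_when_full, free_after_deletes)
```

With 200 users on 100 TB the pool is oversubscribed 2:1, which is exactly why the active management mentioned above is still needed as real usage approaches capacity.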
The combination of data compression, data deduplication, and lightweight quotas helps you stretch more value out of a petabyte of storage. Of course, those are only some of the ways that Oracle's unified storage helps you simplify your storage.
A petabyte of storage just isn't what it used to be.
One of my favorite automobile companies, BMW, ran an advertising campaign a while back promoting the ability to configure to order your BMW from "a million possible combinations, give or take a nappa leather color option or two". That is actually great when you are selling cars, because at any given time one car is only being driven on one road by one driver, and there are many different types of drivers and roads. For many years, a similar design philosophy has been followed by x86 server vendors. The leading x86 vendors today offer a nearly endless combination of server form factors and options: 1 socket, 2 socket, 4 socket, 8 socket; rack mount, tower, blade; different I/O and memory capacities; and on and on. At one time, that made sense, as each server was typically purchased for a dedicated application and the endless options allowed an IT purchaser to configure and pay for only the features they needed. But unlike cars, the vast majority of x86 servers being purchased today are not serving a single user or running a single application.
With the widespread server consolidation enabled by virtualization technologies and the ever-increasing power of multi-core CPUs, the vast majority of an organization's x86 compute demands can today be met with clusters made up of a single x86 server type. Cloud Computing providers like Amazon EC2 have recognized this for years, as have High Performance Computing customers like Sandia National Labs. So why have system vendors continued to insist on gratuitously pumping out more and more x86 server models in every shape, size, and color? Well, if all you have to engineer is individual servers, then I guess you get creative. At Oracle, however, our x86 engineers have been busy designing complete x86 clusters to run Oracle and non-Oracle workloads, and that has led to some of the design decisions exposed in today's launch.
If I had to build an x86 cluster to handle the broadest possible set of workloads, I'd definitely use the new Sun Fire x4800. Powered by up to eight Intel Xeon 7500 series processors, one terabyte of memory, and eight hot-swappable PCIe ExpressModules, this is the most powerful, expandable, and reliable of Oracle’s x86-based servers. Given that the PCIe ExpressModule standard was first announced by the PCI standards body in 2005, it's amazing that five years later we don't see more vendors using this standard to provide hot-swappable I/O cards for their servers. Sun first introduced PCIe ExpressModules in our Sun Blade family of blade servers several years ago, and the Sun Fire x4800 now continues their use. If your systems vendor isn't using the PCIe ExpressModule standard for hot-swap I/O and is only offering proprietary hot-swap solutions, or worse yet, no hot-swap I/O cards, you might want to point them to the 2005 announcement from the PCI SIG. Of course, if you are designing servers intended to be used as single standalone systems instead of in clusters, then perhaps a choice of bezel color is a more important option.
While I don't have time to discuss all of today's product introductions, one more that I did want to discuss is the new Sun Network 10GbE Switch 72p. Offering 72 10GbE ports in a single 1RU chassis, this switch is definitely designed for building clusters, not single servers. While everyone seems to be hawking 10GbE switches these days, most so-called "top of rack" switches only support 24 or 48 ports in a 1RU form factor. To replicate the full non-blocking fabric provided by the Sun Network 10GbE Switch 72p would require nine 24-port switches or five 48-port switches, up to 54 additional cables, one fifth of a rack more space, and significantly more power. When used in conjunction with Oracle's Sun Blade 6000 24p 10GbE NEM, one can easily build non-blocking fabrics of up to 160 nodes or clusters of up to 720 nodes with oversubscription.
So hopefully that gives you a few ideas for building your next x86 cluster. With a lot of vendors, the ideas would stop after the hardware. On the software front, products like Oracle WebLogic 11g Application Server and MySQL Enterprise need no introduction, and they require no modification to run on 10GbE clusters. But let's say you are upgrading an older 2-socket, dual-core x86 server to a new 2-socket, six-core Sun Fire X4170 M2 Server. Do you really need to upgrade to a 10GbE network, or will your application run just fine on your existing 1GbE network? For starters, everything else being equal, if your old server ran a single application, then with 3x as many cores, your new server, given sufficient memory and I/O, should be able to run at least 3 applications using Oracle VM virtualization software. Of course, one of the benefits of Oracle VM is not only server consolidation, but more flexible management. Even if your core applications run fine with 1GbE, you could gain significant performance benefits with 10GbE when you need to move VMs off the server for planned maintenance, for load balancing, or after unplanned server failures (using Oracle VM HA functionality).
Unlike a BMW, which is perhaps best enjoyed by itself on a deserted mountain road, Oracle's new x86 servers are designed to be used together in clusters, along with our high performance 10 GbE and InfiniBand switches, Oracle storage, and Oracle software. Engineered together from application to disk.
Software. Hardware. Complete.
Without actually mentioning ZFS, Henry's analysis points out exactly why the innovative approach of ZFS to data integrity is required in multi-petabyte storage clouds. The key feature of ZFS enabling data integrity is the 256-bit checksum that protects your data. This checksum allows the ZFS self-healing feature to automatically repair corrupted data. ZFS is not new; it was introduced years ago with Solaris 10, and many petabytes of mission-critical data are protected today by ZFS at thousands of companies around the world. When ZFS was first advertised as a future-proof file system, most people were not even dreaming about clouds, but the ZFS designers were certainly thinking about multi-petabyte file systems. That is why they created ZFS with mind-boggling 128-bit scalability.
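The checksum-and-repair idea described above can be illustrated with a short Python sketch. It is a conceptual model only, under stated assumptions: the `MirroredBlock` class is hypothetical, SHA-256 stands in for the 256-bit checksum (ZFS supports several checksum algorithms), and repair relies on a redundant copy being available, as it would be with a mirrored configuration.

```python
import hashlib


class MirroredBlock:
    """Toy model of a checksummed block stored on two mirrored devices (hypothetical)."""

    def __init__(self, data: bytes):
        # The 256-bit checksum is kept apart from the data copies it protects.
        self.checksum = hashlib.sha256(data).digest()
        self.copies = [bytearray(data), bytearray(data)]

    def _is_valid(self, copy: bytearray) -> bool:
        return hashlib.sha256(copy).digest() == self.checksum

    def read(self) -> bytes:
        # Find a copy whose contents still match the stored checksum.
        good = next((bytes(c) for c in self.copies if self._is_valid(c)), None)
        if good is None:
            raise IOError("all copies are corrupt; nothing to repair from")
        # Self-heal: overwrite any copy that fails verification.
        for i, copy in enumerate(self.copies):
            if not self._is_valid(copy):
                self.copies[i] = bytearray(good)
        return good


block = MirroredBlock(b"mission critical data")
block.copies[0][0] ^= 0xFF          # simulate silent corruption on one device
recovered = block.read()            # verified read detects and repairs it
print(recovered, bytes(block.copies[0]) == recovered)
```

The point of the sketch is that corruption is detected on read by comparing against an independently stored checksum, so a bad copy is never silently returned and can be rewritten from a good one.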
So thank you, Henry, for pointing out the quite real limitations of simple geographic replication in large cloud storage environments. If you don't have time to read Henry's brilliant analysis, just ask your cloud storage provider or your own internal IT staff if they are protecting your storage with ZFS. If they ask why, tell them to go ask Henry Newman about it.
Of course, I highly recommend the Oracle + Sun Cloud Strategy Webcast. Oracle is doing so many things related to cloud computing that it's hard to highlight just one, but this webcast does a good job of introducing you to many of Oracle's offerings.
Marten Mickos' Twitter page (from which I borrowed some of these links). As CEO of MySQL, Marten helped turn it into one of the most successful open source companies. These days, Oracle is busy helping customers use MySQL in the cloud, and Marten is working as CEO at cloud startup Eucalyptus, which I am sure he will help make equally successful.
Co-founded by another ex-Sun employee, Manuel Jaffrin's GetApp.com is a virtual yellow pages for cloud apps, cataloging over 2900 business tools and apps along with user reviews and other helpful information. Before you start writing your own cloud app, it's definitely worth a visit to GetApp to see what's already available. As you might expect, GetApp is completely hosted in the cloud; Manuel proudly claims the only computer he owns is his Mac laptop.
There is no shortage of ex-Sun folks in the cloud business. Peder Ulander is now Chief Marketing Officer over at Cloud.com, another open source cloud platform.
And yes, while a lot of cloud computing is still marketing, there is a tremendous amount of real work going on in public and private clouds, like NASA's NEBULA Cloud Computing Platform. It's worth noting that NASA's cloud uses a number of Oracle products, including the Lustre file system and MySQL database.
Please feel free to comment with your own favorite cloud links.
At the event, you will hear how Oracle Technical Computing provides customers with complete systems, from applications to archival storage, with higher quality and lower TCO. This enables faster time to solution and faster time to market for your business. Using technology proven on some of the world's fastest supercomputers, Oracle Technical Computing addresses the needs of customers in a wide range of industries, including Manufacturing, Oil & Gas, Financial Services, and Life Sciences, consolidating Compute with Data Intensive processing across the entire Enterprise.
Register today, as space is limited and attendance at the HPC Consortium will be invite-only and subject to confirmation.
As in previous years, the Consortium's mission is to provide the high performance computing community with leadership and a forum for information exchange. Network, learn, and share ideas for developing and using Oracle’s Sun compute-intensive and data-intensive technologies to achieve business and research objectives.
Listen to practical applications from the BMW Oracle Racing team speaker and see how this team won back the America’s Cup using high performance computing as one of their strategies.
You will receive details on how to register later this month. Please note that space is limited and attendance at the HPC Consortium will be invite-only and subject to confirmation. We hope you plan to join us in May.
When Thinking Machines went bankrupt in 1994, the hardware assets of the company and many of the employees were acquired by Sun Microsystems. What remained of Thinking Machines reformed as a data mining software company and developed the Darwin data mining toolkit. Then in 1999, the data mining business was purchased by Oracle and eventually became ODM.
ODM provides a broad suite of data mining techniques and algorithms to solve many types of business problems, including classification, regression, attribute importance, association, and feature extraction. There are of course many different data mining software packages in existence that could, for instance, determine the association between the frequency of an employee's new blog entries and their number of days traveling in a month. Most of those tools would require you to extract records from a database, input them into the data mining package, run the analysis, and eventually probably store the results back into the database. Therein lies one of the unique advantages of ODM. Much of the data that large enterprises want to mine already exists in a database, so why not put the data mining algorithms into the database too? Then you wouldn't have to move the data in order to mine it. That is exactly what Oracle did about a decade ago with ODM, and it's been evolving ever since.
Today, perhaps the ultimate data mining platform is Oracle's Exadata Database Machine. Much has been written about Exadata's smart flash cache, its hybrid columnar compression, and its fully redundant QDR InfiniBand networking which, combined, make Exadata both a great data warehouse and a great OLTP platform. Add ODM, and Exadata becomes a great platform for such data mining applications as anomaly analysis for fraud analysis, clustering analysis for life sciences drug discovery, or association analysis for product bundling or in-store placement analysis.
You won't need a PhD in statistics to use ODM, but I would recommend the book Super Crunchers to get you started on imagining the possibilities.
The Oracle events team has already been working with the ISC team, and if you check the ISC Sponsors page you can see it has even been updated with the new Sun Oracle logo.
One note: the ISC10 conference will be held two weeks earlier than the traditional mid-June date, and thus the HPC Consortium is also moving to May 29th & 30th. So save the date and stay tuned for the registration site, which will be coming soon.







