
Comprehensive-database-of-Minerals

From MaRDI portal
Dataset:6036459



OpenML: 43356, MaRDI QID: Q6036459

OpenML dataset with id 43356

No author found.

Full work available at URL: https://api.openml.org/data/v1/download/22102181/Comprehensive-database-of-Minerals.arff

Upload date: 23 March 2022



Dataset Characteristics

Number of features: 140 (numeric: 139, symbolic: 0, binary: 0)
Number of instances: 3,112
Number of instances with missing values: 0
Number of missing values: 0

This dataset is a collection of 3,112 minerals with their chemical compositions, crystal structures, and physical and optical properties. The properties included in this database are crystal structure, Mohs hardness, refractive index, optical axes, optical dispersion, molar volume, molar mass, specific gravity, and calculated density.

Introduction

The term dielectric is applied to a class of materials, usually solids, that are poor conductors of electricity. Dielectrics are of significant technological and industrial importance, being essential functional components of almost all electronic devices. For most of these applications, they are required to be mechanically tough and thermally robust. The defining physical attribute of a dielectric is electric polarizability, which is the tendency for charges to be non-uniformly distributed across a chemical bond. Most dielectrics contain dipoles due to their ionic bonds or covalent bonds with strong ionic character. At a macroscopic scale, this implies that an external electric field can interact with these charges and result in various optical and electric phenomena.

Optically, dielectrics can be transparent, opaque, or vitreous. They can also be isotropic, uniaxial, or biaxial. The luster of gem minerals such as emerald, sapphire, and ruby is due to their high refractive index, which causes white light to be split into its components. The presence of two refractive indices in a material can result in an incident beam being split into two rays that interfere with each other. This common phenomenon is called birefringence.

These effects are used in many commercially important applications such as transparent conductive oxides, liquid crystal displays, medical diagnostics, stress sensing, and light modulation. As an example, transparent conducting oxides (TCOs) are derived from dielectrics by doping oxides with impurity atoms.
TCOs do not absorb light in the visible spectrum, which renders them transparent, and they also conduct charge. The most important application of TCOs is as the top electrode of solar cells, where they allow light to fall on a semiconducting layer while capturing the released holes or electrons to generate current. Airplane windshields have a thin coating of a TCO material that generates heat when a current is passed through it. This keeps the glass defrosted so that the pilot has the visibility needed to navigate. Other applications of TCOs are as substrates in electronics, flexible displays, high-definition TVs, and the screens of mobile smart devices. The figure of merit for optical phenomena is the refractive index, which is defined as the ratio of the speed of light in vacuum to the speed of light in the medium.

Provenance of Data

The list of minerals with individual pages in Wikipedia is given at https://en.wikipedia.org/wiki/List_of_minerals. The get method of the requests library is used to retrieve this page, and the content is parsed using BeautifulSoup, a Python library for parsing HTML and XML content. The URLs for all the minerals listed on this page are extracted using their href attributes and stored in a dictionary along with the mineral names. Each of the webpages has textual information on the mineral (origin, etymology, variety, history, etc.), images (cleavages and other data), and an Infobox on the right that tabulates some common mineral properties such as category, formula, Strunz classification, crystal structure, unit cell, Mohs hardness, color, cleavage, fracture, luster, diaphaneity, specific gravity, optical properties, and refractive index. The soup object for the page is retrieved and the table element with class name infobox is extracted. The specified row headings and row data are then read into a dictionary, which is wrapped in a class object.
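
The infobox step described above can be sketched as follows. This is only an illustration under stated assumptions, not the code used to build the dataset: it swaps BeautifulSoup and requests for the standard library's html.parser so that it runs without dependencies or network access, and the names InfoboxParser, parse_infobox, and SAMPLE_HTML, as well as the sample rows, are hypothetical.

```python
from html.parser import HTMLParser


class InfoboxParser(HTMLParser):
    """Collect {row heading: row data} pairs from a table with class 'infobox'."""

    def __init__(self):
        super().__init__()
        self.in_infobox = False
        self.current_cell = None  # "th" or "td" while inside a cell
        self.heading = []
        self.data = []
        self.rows = {}

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            classes = (dict(attrs).get("class") or "").split()
            if "infobox" in classes:
                self.in_infobox = True
        elif self.in_infobox and tag == "tr":
            self.heading, self.data = [], []  # start a fresh row
        elif self.in_infobox and tag in ("th", "td"):
            self.current_cell = tag

    def handle_endtag(self, tag):
        if not self.in_infobox:
            return
        if tag in ("th", "td"):
            self.current_cell = None
        elif tag == "tr":
            # Join text fragments and normalise whitespace.
            key = " ".join("".join(self.heading).split())
            value = " ".join("".join(self.data).split())
            if key and value:
                self.rows[key] = value
        elif tag == "table":
            self.in_infobox = False

    def handle_data(self, text):
        if self.current_cell == "th":
            self.heading.append(text)
        elif self.current_cell == "td":
            self.data.append(text)


def parse_infobox(html):
    """Return the infobox rows of an HTML page as a dictionary."""
    parser = InfoboxParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical, hand-written stand-in for a downloaded mineral page.
SAMPLE_HTML = """
<table class="infobox">
  <tr><th>Category</th><td>Silicate mineral</td></tr>
  <tr><th>Formula</th><td>SiO<sub>2</sub></td></tr>
  <tr><th>Mohs scale hardness</th><td>7</td></tr>
</table>
"""

print(parse_infobox(SAMPLE_HTML))
# {'Category': 'Silicate mineral', 'Formula': 'SiO2', 'Mohs scale hardness': '7'}
```

Nested tables and rows without a heading cell are ignored here for brevity; real infoboxes would likely need more careful handling.
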
A class method writes this data into a CSV file, while another method writes the text from the webpage into a text file.

The American Mineralogist Crystal Structure Database at http://rruff.geo.arizona.edu/AMS/amcsd.php has a list of over 4000 minerals with their CIF files. The names and URLs of all these minerals are found at http://rruff.geo.arizona.edu/AMS. From here, each mineral name and the corresponding URL are extracted using the approach outlined above. Each page gives the crystallographic information of the mineral. The a, b, c edge lengths and the alpha, beta, gamma unit cell angles are given at the top, followed by a list of all atoms and their x, y, z positions. The header is extracted and stored in a pandas DataFrame, while the atomic species and their positions are saved into a separate CSV file. This is repeated for all 4000 minerals. Before inclusion in the machine learning stage of this study, each of these CIF files is read and parsed into a vector in which each cell corresponds to an element of the periodic table and the cell value is the number of atoms of that element in the formula. This is detailed further in the data processing part of the project.

Compared to other properties, dispersion values of minerals have been hard to find. Dispersion values for 60 minerals were found at http://gemologyproject.com/wiki. The chemical formula, molar mass, molar volume, and calculated density are available for all minerals. The availability of other properties varies.

Chemical Formula

The chemical formula has been parsed so that the number of atoms of each element is tabulated separately. For example, the mineral quartz has the formula 'SiO2', so the corresponding entry for the column 'Silicon' is 1 and the entry for 'Oxygen' is 2. The entries for all the other elements are 0.
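
A minimal sketch of this kind of formula parsing is given below. It is an assumption about how such a step could look, not the dataset authors' code: the names TOKEN, parse_formula, to_vector, and COLUMNS are hypothetical, the columns use element symbols rather than the full element names used in the dataset, and parenthesised groups are expanded into their atoms rather than counted as separate ionic species.

```python
import re
from collections import Counter

# One token is an element symbol with an optional count, an opening
# parenthesis, or a closing parenthesis with an optional multiplier.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)|(\()|(\))(\d*)")


def parse_formula(formula):
    """Count atoms per formula unit, e.g. 'SiO2' -> {'Si': 1, 'O': 2}."""
    stack = [Counter()]
    pos = 0
    while pos < len(formula):
        match = TOKEN.match(formula, pos)
        if match is None:
            raise ValueError(f"cannot parse {formula!r} at position {pos}")
        element, count, opening, closing, multiplier = match.groups()
        if element:
            stack[-1][element] += int(count or 1)
        elif opening:
            stack.append(Counter())  # start collecting a parenthesised group
        else:
            if len(stack) == 1:
                raise ValueError(f"unbalanced ')' in {formula!r}")
            group = stack.pop()
            factor = int(multiplier or 1)
            for symbol, n in group.items():
                stack[-1][symbol] += n * factor
        pos = match.end()
    if len(stack) != 1:
        raise ValueError(f"unbalanced '(' in {formula!r}")
    return stack[0]


def to_vector(counts, columns):
    """Lay the counts out in a fixed column order, filling absent elements with 0."""
    return [counts.get(symbol, 0) for symbol in columns]


# Hypothetical, shortened column list; the dataset has one column per element.
COLUMNS = ["H", "C", "O", "F", "Si", "P", "Ca"]

print(to_vector(parse_formula("SiO2"), COLUMNS))        # [0, 0, 2, 0, 1, 0, 0]
print(to_vector(parse_formula("Ca5(PO4)3F"), COLUMNS))  # [0, 0, 12, 1, 0, 3, 5]
```
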
In this way, the chemical formula for each mineral is converted into a vector where each column corresponds to an element in the periodic table and the value corresponds to the number of atoms of the element in a formula unit of the mineral. In addition to the pure elements, ionic species such as carbonate, phosphate, nitrate, cyanide, water of hydration, etc. are also counted separately.

Molar Mass

The molar mass of the mineral is calculated by adding together the mass of each atom in a mole of the mineral.

Molar mass = Summation(number of atoms * mass of each atom)

Molar Volume

The molar volume of the mineral is calculated by adding together the volume of each atom in a mole of the mineral.

Molar volume = Summation(number of atoms * volume of each atom)

Refractive Index

The refractive index of the mineral is defined as the ratio of the speed of light in free space to the speed of light in the mineral. It is a function of the frequency of light: the RI of blue light is not the same as the RI of red light in the same mineral. This variation is measured by 'dispersion'.

Mohs Hardness

Mohs hardness is a qualitative measure of the hardness of a mineral that is frequently used by geologists. Diamond (the hardest mineral) is given the highest value of 10 and talc (the softest mineral) is given the value of 1. A mineral that can scratch a second mineral has a higher Mohs hardness. In this way, all the minerals can be ranked on a relative scale of hardness. It is not exactly clear what physical parameter is represented by Mohs hardness. Several absolute measures such as toughness and yield strength are known from the mechanics of materials; however, none of them seems to correspond exactly to Mohs hardness. Nevertheless, the Mohs scale remains a very intuitive way to understand this physical property of a material.
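
Returning to the molar mass summation given above, the calculation can be sketched in a few lines. This is an illustration only: the function name molar_mass and the small ATOMIC_MASS table are assumptions introduced here (standard atomic weights in g/mol, rounded), and the per-element counts are written out by hand rather than taken from the dataset.

```python
# Rounded standard atomic weights in g/mol for a handful of elements.
# A real table would cover the whole periodic table.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "O": 15.999, "Si": 28.085, "Ca": 40.078}


def molar_mass(counts):
    """Sum (number of atoms * atomic mass) over the elements of one formula unit."""
    return sum(n * ATOMIC_MASS[symbol] for symbol, n in counts.items())


quartz = {"Si": 1, "O": 2}            # SiO2
calcite = {"Ca": 1, "C": 1, "O": 3}   # CaCO3

print(f"{molar_mass(quartz):.2f} g/mol")   # 60.08 g/mol
print(f"{molar_mass(calcite):.2f} g/mol")  # 100.09 g/mol
```

The molar volume formula has the same shape, so the same function would apply with a table of per-atom volumes in place of ATOMIC_MASS; no such table is given in the description, so none is assumed here.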



