First, have a look at this: http://www.iucr.org/resources/cif/spec/ ... /cifsyntax
The schema should be able to represent everything in this spec, even if not everything is implemented at first.
For example:
29. Data names may not exceed 75 characters in length.
There is no point in creating a varchar column of 256 characters for data names, and an oversized column can lead to problems later.
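As a minimal sketch of enforcing that limit in the database itself, here is how it could look with Python's sqlite3 module. The table name data_name and the function name are made up for illustration. SQLite does not enforce the length in VARCHAR(75) on its own, hence the explicit CHECK; other database engines may behave differently.

```python
import sqlite3

MAX_DATA_NAME_LENGTH = 75  # rule 29 of the CIF syntax spec quoted above


def create_data_name_table(conn):
    """Create a hypothetical table holding CIF data names (max 75 characters)."""
    conn.execute(
        "CREATE TABLE data_name ("
        " id INTEGER PRIMARY KEY,"
        " name VARCHAR(75) NOT NULL UNIQUE"
        " CHECK (length(name) <= 75)"  # SQLite ignores the VARCHAR size, so check it
        ")"
    )


conn = sqlite3.connect(":memory:")
create_data_name_table(conn)
conn.execute("INSERT INTO data_name (name) VALUES (?)", ("_publ_author_name",))
```

Inserting a name longer than 75 characters is then rejected with sqlite3.IntegrityError instead of being stored silently.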
_publ_author_name is a data name, not a data block. It does not look very difficult to handle, but think about it.
You don't know the number of names in advance, so this is a 1:N relationship. You need two tables for this.
However, in this case the authors table is not normalised. Duplicate records will occur, because an author can be present in many cif files. It is therefore an N:M relationship and you need 3 tables:
One table for the cif file
One table for the authors
One table for the linking between the two (2 columns defined as foreign key of each primary key of the two tables above)
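The three tables above could be sketched like this with Python's sqlite3 module. All table, column and function names (cif_file, author, cif_file_author, create_schema, add_cif) are made up for illustration, and matching authors on the exact name string is a simplification: real data would need some clean-up of spelling variants first.

```python
import sqlite3


def create_schema(conn):
    """Create the cif file table, the authors table and the linking table."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys
    conn.executescript("""
        CREATE TABLE cif_file (
            id       INTEGER PRIMARY KEY,
            filename TEXT NOT NULL UNIQUE
        );
        CREATE TABLE author (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        -- Linking table: two foreign keys, one per table above.
        CREATE TABLE cif_file_author (
            cif_file_id INTEGER NOT NULL REFERENCES cif_file(id),
            author_id   INTEGER NOT NULL REFERENCES author(id),
            PRIMARY KEY (cif_file_id, author_id)
        );
    """)


def add_cif(conn, filename, author_names):
    """Insert one cif file and link it to its authors, reusing existing authors."""
    cif_id = conn.execute(
        "INSERT INTO cif_file (filename) VALUES (?)", (filename,)
    ).lastrowid
    for name in author_names:
        # An author already known from another cif file is not duplicated.
        conn.execute("INSERT OR IGNORE INTO author (name) VALUES (?)", (name,))
        (author_id,) = conn.execute(
            "SELECT id FROM author WHERE name = ?", (name,)
        ).fetchone()
        conn.execute(
            "INSERT OR IGNORE INTO cif_file_author VALUES (?, ?)",
            (cif_id, author_id),
        )
    return cif_id
```

With this layout an author who appears in two cif files gets one row in author and two rows in cif_file_author.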
_publ_section_title
I could not find out whether this one can span multiple lines or not.
The ; ... ; construct allows multi-line values. The difference between varchar and text columns matters for indexing.
These are more or less trivial, just one column each in the main cif table (_journal_name_full, _diffrn_measurement_method and _diffrn_radiation* could be normalised, i.e. moved to more tables):
Code:
_journal_issue 9
_journal_name_full 'Acta Crystallographica, Section C'
_journal_page_first 1073
_journal_page_last 1074
_journal_volume 56
_journal_year 2000
_chemical_formula_moiety '(C5 H16 N2 )[AlHP2 O8 ]'
_chemical_formula_sum 'C5 H17 Al N2 O8 P2'
_chemical_formula_weight 322.13
[...]
[...]
_audit_creation_method SHELXL-97
_cell_angle_alpha 90.00
_cell_angle_beta 95.1470(10)
_cell_angle_gamma 90.00
_cell_formula_units_Z 4
_cell_length_a 7.8783(2)
_cell_length_b 10.46890(10)
_cell_length_c 16.0680(4)
_cell_measurement_reflns_used 5007
_cell_measurement_temperature 296(2)
_cell_measurement_theta_max 29.83
_cell_measurement_theta_min 2.32
_cell_volume 1319.90(5)
_computing_cell_refinement SMART
_computing_data_collection 'SMART (Siemens, 1996a)'
_computing_data_reduction 'SHELXTL96 (Siemens, 1996b)'
_computing_molecular_graphics 'DIAMOND (Bergerhoff, 1996)'
_computing_publication_material SHELXTL
_computing_structure_refinement 'SHELXL93 (Sheldrick, 1993)'
_computing_structure_solution 'SHELXS86 (Sheldrick, 1990)'
_diffrn_ambient_temperature 296(2)
_diffrn_measurement_device 'Siemens SMART diffractometer'
_diffrn_measurement_method '\w scans'
_diffrn_radiation_monochromator graphite
_diffrn_radiation_source 'fine-focus sealed tube'
_diffrn_radiation_type MoK\a
_diffrn_radiation_wavelength .71073
_diffrn_reflns_av_R_equivalents .0383
_diffrn_reflns_av_sigmaI/netI .0532
_diffrn_reflns_limit_h_max 10
_diffrn_reflns_limit_h_min -10
_diffrn_reflns_limit_k_max 13
_diffrn_reflns_limit_k_min -14
_diffrn_reflns_limit_l_max 9
_diffrn_reflns_limit_l_min -21
_diffrn_reflns_number 8939
_diffrn_reflns_theta_max 29.83
_diffrn_reflns_theta_min 2.32
_exptl_absorpt_coefficient_mu .429
_exptl_absorpt_correction_T_max .978
_exptl_absorpt_correction_T_min .844
_exptl_absorpt_correction_type semi-empirical
_exptl_absorpt_process_details 'SADABS (Sheldrick, 1996)'
_exptl_crystal_colour colorless
_exptl_crystal_density_diffrn 1.621
_exptl_crystal_density_meas 'not measured'
_exptl_crystal_description parallelepiped
_exptl_crystal_F_000 672
_exptl_crystal_size_max .12
_exptl_crystal_size_mid .06
_exptl_crystal_size_min .05
_refine_diff_density_max 1.357
_refine_diff_density_min -.604
_refine_ls_extinction_coef .013(8)
_refine_ls_extinction_method 'SHELXL93 (Sheldrick, 1993)'
_refine_ls_goodness_of_fit_all 1.055
_refine_ls_goodness_of_fit_ref 1.080
_refine_ls_hydrogen_treatment constr
_refine_ls_matrix_type full
_refine_ls_number_parameters 167
_refine_ls_number_reflns 2521
_refine_ls_number_restraints 4
_refine_ls_restrained_S_all 1.370
_refine_ls_restrained_S_obs 1.096
_refine_ls_R_factor_all .1073
_refine_ls_R_factor_gt .0584
_refine_ls_shift/esd_mean .000
_refine_ls_shift/su_max <0.001
_refine_ls_structure_factor_coef Fsqd
_refine_ls_weighting_scheme
'calc w = 1/[\s^2^(Fo^2^)+(0.0573P)^2^+3.0698P] where P=(Fo^2^+2Fc^2^)/3'
_refine_ls_wR_factor_all .2069
_refine_ls_wR_factor_ref .1362
_reflns_number_gt 1901
_reflns_number_total 3421
_reflns_threshold_expression I>2\s(I)
Remark: a value with a standard deviation in parentheses, such as 7.8783(2), is tricky because it is not a plain number. It can't be stored in a numeric column as is.
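One way around this is to split such a value into two numeric columns, the value and its uncertainty. Below is a minimal Python sketch, assuming the usual convention that the digits in parentheses apply to the last decimal places of the value. The function name parse_cif_number is made up, and exponent notation is deliberately not handled.

```python
import re
from decimal import Decimal

# Plain decimal number, optionally followed by uncertainty digits in parentheses.
_NUMBER_WITH_SU = re.compile(r"^([+-]?(?:\d+\.?\d*|\.\d+))(?:\((\d+)\))?$")


def parse_cif_number(text):
    """Split a string like '7.8783(2)' into (value, uncertainty) as Decimals.

    The uncertainty is None when no parentheses are present.
    Raises ValueError for anything that is not a plain decimal number.
    """
    match = _NUMBER_WITH_SU.match(text.strip())
    if match is None:
        raise ValueError(f"not a plain CIF number: {text!r}")
    value = Decimal(match.group(1))
    su_digits = match.group(2)
    if su_digits is None:
        return value, None
    # The uncertainty digits share the scale of the value's last decimal place,
    # e.g. exponent -4 for '7.8783' turns '2' into 0.0002.
    exponent = value.as_tuple().exponent
    return value, Decimal(int(su_digits)).scaleb(exponent)


value, su = parse_cif_number("7.8783(2)")
```

Decimal is used instead of float so that trailing zeros such as those in 95.1470(10) keep their meaning. A value such as <0.001 is not a number at all and is rejected with ValueError, so it would need a text column or separate handling.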
These ones:
_symmetry_cell_setting Monoclinic
_symmetry_space_group_name_H-M P2(1)/n
These should be handled with dictionary (lookup) tables: one table for cell_setting and one table for the space group name. This enforces integrity and consistency.
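A minimal sketch of such lookup tables with Python's sqlite3 module follows. The table and column names (cell_setting, space_group, structure) are made up. The list of cell settings is my assumption of what the CIF core dictionary allows and should be checked against it, and only the one space group name from the example above is seeded, where a real table would list them all.

```python
import sqlite3

# Assumed enumeration for _symmetry_cell_setting; verify against the CIF core dictionary.
CELL_SETTINGS = [
    "triclinic", "monoclinic", "orthorhombic", "tetragonal",
    "rhombohedral", "trigonal", "hexagonal", "cubic",
]


def create_lookup_schema(conn):
    """Create the two dictionary tables and a main table that references them."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys
    conn.executescript("""
        CREATE TABLE cell_setting (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE space_group (
            id      INTEGER PRIMARY KEY,
            name_hm TEXT NOT NULL UNIQUE
        );
        CREATE TABLE structure (
            id              INTEGER PRIMARY KEY,
            cell_setting_id INTEGER NOT NULL REFERENCES cell_setting(id),
            space_group_id  INTEGER NOT NULL REFERENCES space_group(id)
        );
    """)
    conn.executemany(
        "INSERT INTO cell_setting (name) VALUES (?)",
        [(name,) for name in CELL_SETTINGS],
    )
    conn.execute("INSERT INTO space_group (name_hm) VALUES (?)", ("P2(1)/n",))


def insert_structure(conn, cell_setting, space_group_name):
    """Insert a structure row; unknown dictionary values raise ValueError."""
    setting = conn.execute(
        "SELECT id FROM cell_setting WHERE name = ?", (cell_setting.lower(),)
    ).fetchone()
    group = conn.execute(
        "SELECT id FROM space_group WHERE name_hm = ?", (space_group_name,)
    ).fetchone()
    if setting is None or group is None:
        raise ValueError("value not present in the dictionary tables")
    return conn.execute(
        "INSERT INTO structure (cell_setting_id, space_group_id) VALUES (?, ?)",
        (setting[0], group[0]),
    ).lastrowid
```

The main table then stores only ids, so a misspelled cell setting cannot slip in, and the foreign keys reject ids that do not exist in the dictionary tables.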
Code:
_symmetry_equiv_pos_as_xyz
'x, y, z'
'-x+1/2, y+1/2, -z+1/2'
'-x, -y, -z'
'x-1/2, -y-1/2, z-1/2'
This is a 1:N relationship, possibly N:M. Is there a finite number of symmetry operators?
Relational databases (RDBMS) are complicated, so the schema has to be designed properly. Just to handle genealogical data, I am using about 10 tables for data and 25 relation tables for N:M relationships.
