Difference between revisions of "Main Page/Endnote"

Latest revision as of 16:51, 13 October 2009

Background

Term lists in EndNote v7 function as abbreviation lists for bibliographies. Some journals require full journal names in references, and others require abbreviations (some with dots, some without). For example, your journal might want any one of these formats:

  • Abdominal Imaging
  • Abdom Imaging
  • Abdom. Imaging

EndNote ships with rather sparse term lists. This tutorial describes how to make standardized and more comprehensive term lists.

Prerequisites

  1. EndNote installed on the PC
  2. perl installed on the PC (if you want to recreate the process), or a copy of the term list I generated

Fixing abbreviated journal names

Some of my references had the full journal name and others had abbreviations. This caused problems when I needed a bibliography in a specific format.

I got a copy of term lists from the University of Queensland. These term lists contain tab-separated full, abbreviated, and dot-abbreviated journal names. However, the abbreviations were arranged haphazardly: for some journals the dotted abbreviation was in the 2nd column, and for others it was in the 3rd. The lists therefore needed rearranging so that each kind of abbreviation sits in the same column throughout. Luckily, perl is very handy for string manipulation.

A perl script for creating a sed script to fix the term list

This script writes a standardized term list in which the abbreviation without dots is in the 2nd column and the abbreviation with dots is in the 3rd:

#!/usr/bin/perl
use strict;
use warnings;

# standardize an endnote journal term list so that the abbreviation
# with dots always ends up in the 3rd column

# args
if (@ARGV == 0) {
  print "Usage: $0 <infile>\n";
  exit 1;
}

# does infile exist?
my $infile = $ARGV[0];
if (! -e $infile) {
  print "$infile does not exist\n";
  exit 1;
}

# output file
(my $outfile = $infile) =~ s/\.txt$/_fixed.txt/;
# refuse to overwrite the input if its name does not end in .txt
die "$infile must have a .txt extension\n" if $outfile eq $infile;
open (OUTTERM, ">", $outfile) or die "cannot open $outfile: $!";

# read input file
open (INFILE, "<", $infile) or die "cannot open $infile: $!";
while (my $record = <INFILE>) {
    chomp $record;
    # split into pieces with tabs
    my @list = split(/\t/, $record);
    # only records with exactly 3 substrings are written out
    next unless @list == 3;
    my ($ss1, $ss2, $ss3) = @list;
    # if the 3rd substring has no dot, swap the two abbreviations;
    # this moves a dotted 2nd substring into the 3rd column
    # (if neither has a dot, the swap makes no difference to the dots)
    if ($ss3 !~ m/\./) {
        ($ss2, $ss3) = ($ss3, $ss2);
    }
    # if the 3rd substring has a dot, the order is kept as is
    print OUTTERM "$ss1\t$ss2\t$ss3\n";
}
close INFILE;
close OUTTERM;
print "created $outfile\n";

The script can be run from the command line, e.g., construct_endnote_fix_sed.pl pmh_phd_journals_term.txt, which writes pmh_phd_journals_term_fixed.txt.
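The column rule can also be tried on a single record without touching any files. The following is only a minimal sketch: normalize_record is a hypothetical helper name, the sample records are made up from the journal name used in the Background section, and the rule is simplified to swap only when the dot is in the 2nd column and not in the 3rd.

```perl
use strict;
use warnings;

# Hypothetical helper: reorder one tab-separated term-list record so that
# the dotted abbreviation ends up in column 3.
sub normalize_record {
    my ($record) = @_;
    my @fields = split /\t/, $record;
    return unless @fields == 3;          # skip incomplete records
    my ($full, $abb1, $abb2) = @fields;
    # swap only when the dot is in column 2 and not in column 3
    ($abb1, $abb2) = ($abb2, $abb1) if $abb1 =~ /\./ && $abb2 !~ /\./;
    return join "\t", $full, $abb1, $abb2;
}

# made-up sample records based on the Background example
my @samples = (
    "Abdominal Imaging\tAbdom. Imaging\tAbdom Imaging",   # dots in the wrong column
    "Abdominal Imaging\tAbdom Imaging\tAbdom. Imaging",   # already in order
);
for my $sample (@samples) {
    my $fixed = normalize_record($sample);
    print "$fixed\n" if defined $fixed;
}
```

Both sample records come out with "Abdom Imaging" in the 2nd column and "Abdom. Imaging" in the 3rd.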

To update the term list in Endnote

  1. Open the library.
  2. Click on Tools on the menu bar and choose Define Term Lists.
  3. Highlight Journals.
  4. Select all journals.
  5. Delete them (right-click/Cut). Do not remove the term list altogether; there does not seem to be a way to add the journals term list back with all the fields in EndNote 7, contrary to the documentation.
  6. Click on the Import List button.
  7. Select the text file to be imported and click on the Open button (I used the Medical, Biosciences, and Chemistry lists that were rearranged with the perl script above). You can download this list if you want and use that in the process described here.
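Since the existing journal terms are deleted before the import, it may be worth checking the text file first. The following is a minimal sketch of such a check, not part of the original process: find_bad_lines and the script name check_term_list.pl are hypothetical, and the check assumes the column layout described above (three tab-separated fields, with any dotted abbreviation in the 3rd column).

```perl
use strict;
use warnings;

# Hypothetical check: return the numbers of lines that look wrong
# (not exactly 3 fields, or a dot in column 2 but none in column 3).
sub find_bad_lines {
    my ($path) = @_;
    open(my $fh, "<", $path) or die "cannot open $path: $!";
    my @bad;
    while (my $record = <$fh>) {
        chomp $record;
        my @fields = split /\t/, $record, -1;   # -1 keeps empty trailing fields
        if (@fields != 3 || ($fields[1] =~ /\./ && $fields[2] !~ /\./)) {
            push @bad, $.;                      # $. is the current line number
        }
    }
    close $fh;
    return @bad;
}

# usage: perl check_term_list.pl pmh_phd_journals_term_fixed.txt
if (@ARGV) {
    my @bad = find_bad_lines($ARGV[0]);
    print @bad ? "check lines: @bad\n" : "all records look consistent\n";
}
```

Any line numbers reported can be inspected in a text editor before the list is imported.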

To create a series of XML files for import to EndNote

NOT NECESSARY. Once EndNote has a term list containing the proper abbreviations, any reference added to a library can use any of the abbreviation styles in the term list, provided the list includes the abbreviation for that journal. The script below is kept for reference only.

#!/usr/bin/perl
use strict;
use warnings;

# apply the journal name substitutions to the endnote library and
# write the records out in files of 100

# open the xml library and read it all in
open (LIBRARY, "<", "pmh_phd.xml") or die "cannot open pmh_phd.xml: $!";
# undefine $/ so the whole file is read, not just the first line
my $lines = do { local $/; <LIBRARY> };
close (LIBRARY);

# remove the header and trailer stuff
$lines =~ s/<XML><RECORDS>//;
$lines =~ s/<\/RECORDS><\/XML>//;

# open and start reading the list of journal names
open (SED, "<", "sed_nodots.sed") or die "cannot open sed_nodots.sed: $!";
my $count = 0;
while (my $sedstr = <SED>) {
    $count++;
    print "$count\n";
    chomp $sedstr;
    # fields are separated by "|"; field 1 is the pattern, field 2 the replacement
    my @substr = split(/\|/, $sedstr);
    next unless @substr >= 3;
    my ($sub1, $sub2) = @substr[1, 2];
    # ampersands must be escaped in xml
    $sub2 =~ s/&/&amp;/g;
    # $sub1 is used as a regular expression, so a dot in it matches
    # any character unless it is escaped in the sed file
    $lines =~ s/<SECONDARY_TITLE>$sub1<\/SECONDARY_TITLE>/<SECONDARY_TITLE>$sub2<\/SECONDARY_TITLE>/g;
}
close (SED);

# an array of lines, one record per line
$lines =~ s/\/RECORD>/\/RECORD>\n/g;
my @linelist = split(/\n/, $lines);

# write the records, starting a new file after every 100
my $outfile = "xml_0.xml";
open (OUT, ">", $outfile) or die "cannot open $outfile: $!";
$count = 0;
foreach my $line (@linelist) {
    $count++;
    # remove empty style tags
    $line =~ s/<styles><\/styles>//g;
    print OUT $line;
    if ($count % 100 == 0) {
        close (OUT);
        $outfile = "xml_" . ($count / 100) . ".xml";
        open (OUT, ">", $outfile) or die "cannot open $outfile: $!";
    }
}
close (OUT);
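The two core steps of this script, replacing a journal name inside the SECONDARY_TITLE tags and grouping records in hundreds, can be sketched on in-memory data. This is only an illustration under stated assumptions: replace_title and batch_records are hypothetical helpers, the sample record is made up, and unlike the script above the sketch takes plain journal names and quotes them with \Q...\E so that dots match literally.

```perl
use strict;
use warnings;

# Hypothetical helper: replace one journal name inside <SECONDARY_TITLE> tags.
sub replace_title {
    my ($xml, $old, $new) = @_;
    $new =~ s/&/&amp;/g;    # "&" must be written as "&amp;" inside XML
    # \Q...\E makes $old match literally, so dots are not wildcards
    $xml =~ s/<SECONDARY_TITLE>\Q$old\E<\/SECONDARY_TITLE>/<SECONDARY_TITLE>$new<\/SECONDARY_TITLE>/g;
    return $xml;
}

# Hypothetical helper: group records into batches of at most $size.
sub batch_records {
    my ($size, @records) = @_;
    my @batches;
    push @batches, [ splice(@records, 0, $size) ] while @records;
    return @batches;
}

# made-up sample record
my $record = "<RECORD><SECONDARY_TITLE>Abdom. Imaging</SECONDARY_TITLE></RECORD>";
print replace_title($record, "Abdom. Imaging", "Abdominal Imaging"), "\n";

# 250 copies of the record fall into batches of 100, 100 and 50
my @batches = batch_records(100, ($record) x 250);
print scalar(@batches), " batches\n";
```

Each batch could then be written to its own file, as the script above does with xml_0.xml, xml_1.xml, and so on.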