Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sed compatibility problems #2225

Closed
danpovey opened this issue Feb 16, 2018 · 10 comments
Closed

Sed compatibility problems #2225

danpovey opened this issue Feb 16, 2018 · 10 comments
Labels
stale Stale bot on the loose

Comments

@danpovey
Copy link
Contributor

I hope someone can find the time to help with this.
This is a sed compatibility issue that affects Macs and other BSD. Commands like the following:
sed -i'' 's/#1$//g' $lout/lexicon.txt
wil fail on mac, due to the command being interpreted as an extension argument to the -i option. (The '' is not seen by sed). I have looked into this and there appears to be no way to get this to work in a compatible way without adding an extension. I suggest to do this as, for instance:

sed -i.bak 's/#1$//g' $lout/lexicon.txt
rm $lout/lexicon.txt.bak 

... or you can remove the backup file later in the script, or perhaps better, explicitly reference the original location so it's not in-place at all.

There are quite a few sed -ir commands. I don't know how those arose; 'r' is a strange backup extension and the letter 'r' is certainly not interpreted as an option (and anyway that option doesn't exist on mac). I think those r's should be changed to .bak and the backup removed if posible.

I'd like to warn people that sed should be considered HIGHLY SUSPECT from a compatibility point of view, especially if you are using command-line switches. perl is preferable if possible.

See below for affected locations, but please run this yourself with larger maxdepth in case there are more.

cd egs/wsj/s5
find ../.. -maxdepth 6 -name '*.sh' -exec grep 'sed -i' {} \; -print 2>/dev/null
     sed -ir "s!$p1!$p2!" $dir/segments
../../ami/s5/local/ami_ihm_scoring_data_prep.sh
     sed -ir "s:$p1:$p2:" $tmpdir/segments
../../ami/s5/local/ami_mdm_scoring_data_prep.sh
     sed -ir "s:$p1:$p2:" $tmpdir/segments
../../ami/s5/local/ami_sdm_scoring_data_prep.sh
     sed -ir "s!$p1!$p2!" $dir/segments
../../ami/s5b/local/ami_ihm_scoring_data_prep.sh
     sed -ir "s:$p1:$p2:" $tmpdir/segments
../../ami/s5b/local/ami_mdm_scoring_data_prep.sh
     sed -ir "s:$p1:$p2:" $tmpdir/segments
../../ami/s5b/local/ami_sdm_scoring_data_prep.sh
  sed -i "s/_non_normalized//g" $file
../../aspire/s5/local/multi_condition/prepare_impulses_noises.sh
sed -i'' 's/#1$//g' $lout/lexicon.txt
sed -i'' 's/#1$//g' $lout/lexiconp.txt
../../babel/s5c/local/syllab/generate_syllable_lang.sh
sed -i'' 's/#1$//g' $lout/lexicon.txt
sed -i'' 's/#1$//g' $lout/lexiconp.txt
../../babel/s5d/local/syllab/generate_phone_lang.sh
sed -i'' 's/#1$//g' $lout/lexicon.txt
sed -i'' 's/#1$//g' $lout/lexiconp.txt
../../babel/s5d/local/syllab/generate_syllable_lang.sh
sed -i "/\[$w\]/d" $tmpdir/lexicon.3
../../callhome_egyptian/s5/local/callhome_prepare_dict.sh
#sed -i "s:\s.*_fsp-([AB]): \1:g" data/dev/stm
#ls exp/tri5a/decode_dev/score_*/dev.ctm | xargs -I {} sed -i -r 's:fsp\s1\s:fsp A :g' {}
#ls exp/tri5a/decode_dev/score_*/dev.ctm | xargs -I {} sed -i -r 's:fsp\s2\s:fsp B :g' {}
../../callhome_egyptian/s5/local/ctm.sh
#sed -i "s:\s.*_fsp-([AB]): \1:g" data/dev/stm
#ls exp/tri5a/decode_dev/score_*/dev.ctm | xargs -I {} sed -i -r 's:fsp\s1\s:fsp A :g' {}
#ls exp/tri5a/decode_dev/score_*/dev.ctm | xargs -I {} sed -i -r 's:fsp\s2\s:fsp B :g' {}
../../fisher_callhome_spanish/s5/local/ctm.sh
    sed -i "/\[$w\]/d" $tmpdir/lexicon.2
../../fisher_callhome_spanish/s5/local/fsp_prepare_dict.sh
sed -i '1i<UNK> SIL' $dir/lexicon.txt # insert word <UNK> with phone sil at the begining of the dictionary
../../gale_arabic/s5b/local/gale_prep_grapheme_dict.sh
 sed -i '1i<UNK> SIL' $dir/lexicon.txt
../../gale_arabic/s5/local/gale_prep_dict.sh

@langep
Copy link
Contributor

langep commented Mar 29, 2018

We will still run into problems when using sed -r e.g. ls exp/tri5a/decode_dev/score_*/dev.ctm | xargs -I {} sed -i -r 's:fsp\s1\s:fsp A :g' {}. BSD sed uses -E instead of -r.

I think we can replace all sed -i -r with perl -pi -e. This might be slightly slower but probably does not matter in these scripts.

sed -i and sed -i'' would need to either use backups with additional cleanup step or the regexes would need to be converted to extended regexes. In the sed -ir case, someone with more experience should probably check if -i -r was intended.

An alternative option to converting anything is to install gnu-sed during the initial installation of dependencies. This could be done with:

brew install gnu-sed --with-default-names
brew link gnu-sed

This would require the user to add /usr/local/bin in front of their PATH but most brew users probably have this already. Furthermore, this would potentially break other scripts the user wrote for BSD sed if they are not careful about setting up their PATH.

@danpovey
Copy link
Contributor Author

danpovey commented Mar 29, 2018 via email

@johnjosephmorgan
Copy link
Contributor

I found sed compatibility problems in the following file:
egs/heroico/s5/local/prepare_data.sh
I suggest replacing:
sed -e "s/\r//"
with
tr -d "\r"

@danpovey
Copy link
Contributor Author

danpovey commented Aug 1, 2018 via email

@johnjosephmorgan
Copy link
Contributor

johnjosephmorgan commented Aug 1, 2018 via email

@danpovey
Copy link
Contributor Author

danpovey commented Jan 4, 2019

@desh2608 do you think you might have time to work on this issues? You can ask vimal for help.

@desh2608
Copy link
Contributor

desh2608 commented Jan 5, 2019

I understand that all uses of sed -i -r are to be replaced with perl -pi -e, as per @langep 's comment above. I'll start with making this edit.

@danpovey
Copy link
Contributor Author

danpovey commented Jan 5, 2019

OK, but don't take his word for it-- you'll need to figure out a way to test that it does the right thing (even if it doesn't involve fully running the examples).

@stale
Copy link

stale bot commented Jun 19, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale Stale bot on the loose label Jun 19, 2020
@stale
Copy link

stale bot commented Jul 19, 2020

This issue has been automatically closed by a bot strictly because of inactivity. This does not mean that we think that this issue is not important! If you believe it has been closed hastily, add a comment to the issue and mention @kkm000, and I'll gladly reopen it.

@stale stale bot closed this as completed Jul 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale Stale bot on the loose
Projects
None yet
Development

No branches or pull requests

4 participants