Explicitly allow certain file extensions
The MIME type of `.org` files is sometimes read as
`application/vnd.lotus-organizer`. That's the type for files made by Lotus
Organizer, a proprietary tool whose last stable release was in 2003 and which
was discontinued in 2013.

We're probably not encountering Lotus Organizer files often, whereas Org-mode
files are relatively plentiful.

This also explicitly permits a few other file extensions that are likely to be
searched, bypassing MIME type checking.
hrs committed Jun 7, 2023
1 parent bae7ca9 commit 7685e70
Showing 1 changed file with 14 additions and 1 deletion.
15 changes: 14 additions & 1 deletion corpus/document.go
@@ -22,6 +22,14 @@ type Document struct {
 	norm float64
 }

+var explicitlyPermittedExtensions = map[string]bool{
+	".markdown": true,
+	".md":       true,
+	".org":      true,
+	".tex":      true,
+	".txt":      true,
+}
+
 func ParseDocument(path string, config *Config) (*Document, error) {
 	// Ensure that this is a text file
 	if !isTextFile(path) {
@@ -126,7 +134,12 @@ func (doc *Document) calcNorm() float64 {
 }

 func isTextFile(path string) bool {
-	// First, try to get the file's MIME type from its extension, if that's
+	// If the file's extension is explicitly permitted, just use that.
+	if permitted, _ := explicitlyPermittedExtensions[filepath.Ext(path)]; permitted {
+		return true
+	}
+
+	// Try to get the file's MIME type from its extension, if that's
 	// available.
 	mimeType := mime.TypeByExtension(filepath.Ext(path))
