Post History

Q&A Determine encoding of text

I have some text files which think they are encoded in UTF-8: file test.txt test.txt: Unicode text, UTF-8 text, with CRLF line terminators However if I look at their content, I think they migh...

3 answers · posted 2y ago by samcarter · edited 2mo ago by Andreas demands justice for humanity

Question character-encoding

#2: Post edited by

Andreas demands justice for humanity · 2025-04-07T20:14:33Z (about 2 months ago)
Dead link; not really valuable addition to the question anyway.

Determine encoding of text

~~I have some text files which think they are encoded in utf8:~~
```
file test.txt
test.txt: Unicode text, UTF-8 text, with CRLF line terminators
```
~~(https://github.com/samcarter/shared/blob/main/test.txt )~~
However if I look at their content, I think they might in reality have some other encoding:
```
ÒHi there. IÕm a test documentÓ
ÒTouchŽ.Ó
```
From context, this should read as
```
“Hi there. I'm a test document”
“Touché.”
```
How can I determine the original encoding of the text so that I can re-encode the file with `iconv` to hopefully get a readable text?

I have some text files which think they are encoded in UTF-8:
```
file test.txt
test.txt: Unicode text, UTF-8 text, with CRLF line terminators
```
However if I look at their content, I think they might in reality have some other encoding:
```
ÒHi there. IÕm a test documentÓ
ÒTouchŽ.Ó
```
From context, this should read as
```
“Hi there. I'm a test document”
“Touché.”
```
How can I determine the original encoding of the text so that I can re-encode the file with `iconv` to hopefully get a readable text?
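
One way to approach this is to assume the text was decoded with the wrong single-byte encoding at some point and then saved as UTF-8, which would explain why `file` reports valid UTF-8. The sketch below reverses that under stated assumptions: it re-encodes the garbled text with each "wrong" candidate to recover the raw bytes, then decodes those bytes with each "original" candidate. The candidate lists and the function name `candidate_repairs` are illustrative choices, not something given in the question; the codec names are Python's standard ones.

```python
# Mojibake sample copied from the question (CRLF line terminator as reported by `file`).
garbled = "ÒHi there. IÕm a test documentÓ\r\nÒTouchŽ.Ó"

# Assumed candidates, chosen for illustration only; extend as needed.
WRONG_DECODERS = ["cp1252", "latin-1", "cp437"]                   # how the bytes may have been misread
ORIGINAL_ENCODINGS = ["mac_roman", "cp1252", "cp850", "utf-8"]    # what the bytes may really have been


def candidate_repairs(text):
    """Yield (wrong, original, repaired) for every pair that converts without error."""
    for wrong in WRONG_DECODERS:
        try:
            # Undo the suspected wrong decoding to get the raw bytes back.
            raw = text.encode(wrong)
        except UnicodeEncodeError:
            continue  # the text contains characters this encoding cannot represent
        for original in ORIGINAL_ENCODINGS:
            if original == wrong:
                continue  # would just reproduce the garbled text
            try:
                yield wrong, original, raw.decode(original)
            except UnicodeDecodeError:
                continue  # the bytes are not valid in this encoding


for wrong, original, repaired in candidate_repairs(garbled):
    print(f"{wrong} -> {original}: {repaired!r}")
```

For the sample above, the `cp1252` / `mac_roman` pair yields `“Hi there. I’m a test document”` and `“Touché.”`, which suggests (but does not prove) that the file started out as Mac OS Roman. Other pairs can also convert without error and still produce nonsense, so the output needs a human eye. The same two-step conversion can be done with `iconv`, though encoding names differ between implementations.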

#1: Initial revision by

samcarter · 2023-08-24T12:45:40Z (almost 2 years ago)

Determine encoding of text

I have some text files which think they are encoded in utf8:

```
file test.txt
test.txt: Unicode text, UTF-8 text, with CRLF line terminators
```

(https://github.com/samcarter/shared/blob/main/test.txt )

However if I look at their content, I think they might in reality have some other encoding:

```
ÒHi there. IÕm a test documentÓ

ÒTouchŽ.Ó 
```

From context, this should read as

```
“Hi there. I'm a test document”

“Touché.”
```

How can I determine the original encoding of the text so that I can re-encode the file with `iconv` to hopefully get a readable text?

character-encoding