Truncating a long string in Python


copyscript 2023. 1. 20. 16:14


How do you truncate a string to 75 characters in Python?

This is how it is done in JavaScript:

var data="saddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddsaddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddsadddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd"
var info = (data.length > 75) ? data.substring(0, 75) + '..' : data;

info = (data[:75] + '..') if len(data) > 75 else data

Even more concisely:

data = data[:75]

If the string is shorter than 75 characters, it is left unchanged.
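The plain slice is safe because Python clamps an out-of-range slice end instead of raising an error. A minimal sketch of that behaviour follows; the sample strings are made up for illustration.

```python
# Slicing past the end of a string does not raise; the end index is clamped.
short = "hello"
long_text = "x" * 100

short_cut = short[:75]      # shorter than 75 characters: returned unchanged
long_cut = long_text[:75]   # longer than 75 characters: cut to exactly 75

print(short_cut)
print(len(long_cut))
```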

Even shorter:

info = data[:75] + (data[75:] and '..')

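If you are using Python 3.4+, you can use textwrap.shorten from the standard library:

Collapse and truncate the given text to fit in the given width.

First the whitespace in the text is collapsed (all whitespace is replaced by single spaces). If the result fits in the width, it is returned. Otherwise, enough words are dropped from the end so that the remaining words plus the placeholder fit within the width.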
>>> textwrap.shorten("Hello  world!", width=12)
'Hello world!'
>>> textwrap.shorten("Hello  world!", width=11)
'Hello [...]'
>>> textwrap.shorten("Hello world", width=10, placeholder="...")
'Hello...'
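One caveat worth knowing: textwrap.shorten drops whole words and never keeps part of one, so a long run of characters without spaces collapses to just the placeholder. A small sketch with invented inputs:

```python
import textwrap

# A single 40-character "word" cannot fit in width 10 together with the
# default placeholder ' [...]', so only the (stripped) placeholder remains.
no_spaces = "a" * 40
result_no_spaces = textwrap.shorten(no_spaces, width=10)

# With spaces, whole words are dropped from the end until the text fits.
with_spaces = "alpha beta gamma delta"
result_with_spaces = textwrap.shorten(with_spaces, width=16)

print(repr(result_no_spaces))
print(repr(result_with_spaces))
```

If you need a hard character cut rather than word-based shortening, slicing is the simpler tool.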

For a Django solution (which was not mentioned in the question):

from django.utils.text import Truncator
value = Truncator(value).chars(75)

Have a look at Truncator's source code to appreciate the problem: https://github.com/django/django/blob/master/django/utils/text.py#L66

Concerning truncation with Django: Django HTML truncation

With a regex:

re.sub(r'^(.{75}).*$', r'\g<1>...', data)

Long strings are truncated:

>>> data="11111111112222222222333333333344444444445555555555666666666677777777778888888888"
>>> re.sub(r'^(.{75}).*$', r'\g<1>...', data)
'111111111122222222223333333333444444444455555555556666666666777777777788888...'

Shorter strings are never truncated:

>>> data="11111111112222222222333333"
>>> re.sub(r'^(.{75}).*$', r'\g<1>...', data)
'11111111112222222222333333'

This way you can also "cut" the middle part of the string, which is nicer in some cases:

re.sub(r'^(.{5}).*(.{5})$', r'\g<1>...\g<2>', data)

>>> data="11111111112222222222333333333344444444445555555555666666666677777777778888888888"
>>> re.sub(r'^(.{5}).*(.{5})$', r'\g<1>...\g<2>', data)
'11111...88888'
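A caveat with the regex approach: by default `.` does not match a newline, so a multi-line string can slip through untruncated. A minimal sketch, assuming that passing re.DOTALL is acceptable for your data (the sample text is invented):

```python
import re

# 50 characters, a newline, then 50 more: 101 characters in total.
multiline = "a" * 50 + "\n" + "b" * 50

# Without DOTALL, '.{75}' cannot cross the newline, so nothing matches
# and the string comes back unchanged.
unchanged = re.sub(r'^(.{75}).*$', r'\g<1>...', multiline)

# With DOTALL, '.' also matches '\n', so the first 75 characters are kept.
truncated = re.sub(r'^(.{75}).*$', r'\g<1>...', multiline, flags=re.DOTALL)

print(len(unchanged), len(truncated))
```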

limit = 75
info = data[:limit] + '..' * (len(data) > limit)

Here is another variant:

n = 8
s = '123'
print(s[:n-3] + (s[n-3:], '...')[len(s) > n])
s = '12345678'
print(s[:n-3] + (s[n-3:], '...')[len(s) > n])
s = '123456789'
print(s[:n-3] + (s[n-3:], '...')[len(s) > n])
s = '123456789012345'
print(s[:n-3] + (s[n-3:], '...')[len(s) > n])

123
12345678
12345...
12345...

This method does not use any if:

data[:75] + bool(data[75:]) * '..'
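The trick works because bool is a subclass of int in Python, so True behaves as 1 and False as 0 when multiplying a string. A short sketch with made-up sample data:

```python
long_data = "x" * 80
short_data = "x" * 10

# data[75:] is non-empty only when the string is longer than 75 characters.
suffix_long = bool(long_data[75:]) * '..'    # True * '..'  -> '..'
suffix_short = bool(short_data[75:]) * '..'  # False * '..' -> ''

info_long = long_data[:75] + suffix_long
info_short = short_data[:75] + suffix_short

print(len(info_long), len(info_short))
```

It is compact, though a conditional expression is probably easier for most readers to follow.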

info = data[:min(len(data), 75)]

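You can't actually "truncate" a Python string the way you can a dynamically allocated C string. Strings in Python are immutable. What you can do is slice a string as described in the other answers, which yields a new string containing only the characters defined by the slice offsets and step. In some (non-practical) cases this can be a little annoying, for example when you choose Python as your interview language and the interviewer asks you to remove duplicate characters from a string in place. Doh.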
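To make the immutability point concrete, here is a small sketch (variable names are illustrative only): slicing leaves the original untouched, and item assignment on a str raises TypeError.

```python
original = "abcdefghij"
truncated = original[:4]   # a new string object; 'original' is not modified

try:
    original[0] = "X"      # strings do not support item assignment
    mutated = True
except TypeError:
    mutated = False

print(original, truncated, mutated)
```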

info = data[:75] + ('..' if len(data) > 75 else '')

Yet another solution. With True and False as keys, you get a little feedback about the test at the end.

data = {True: data[:75] + '..', False: data}[len(data) > 75]

Coming late to the party, I want to add my solution for trimming text at character level that also handles whitespace properly.

def trim_string(s: str, limit: int, ellipsis='…') -> str:
    s = s.strip()
    if len(s) > limit:
        return s[:limit-1].strip() + ellipsis
    return s

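Simple, but it makes sure that hello world with limit=6 does not end up as an ugly hello … but as hello… instead.

It also removes leading and trailing whitespace, but not internal whitespace. If you also want to remove internal whitespace, check out this Stack Overflow post.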
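The linked post is not reproduced here, so the following is only a sketch of one common way to collapse internal whitespace before trimming: `" ".join(s.split())`. The function name `trim_collapsed` is hypothetical and not part of the original answer.

```python
def trim_collapsed(s: str, limit: int, ellipsis: str = '…') -> str:
    # Hypothetical variant of trim_string: str.split() with no arguments
    # splits on any whitespace run and drops leading/trailing whitespace,
    # so joining with a single space collapses internal whitespace too.
    collapsed = " ".join(s.split())
    if len(collapsed) > limit:
        return collapsed[:limit - 1].strip() + ellipsis
    return collapsed


print(trim_collapsed("  hello    world  ", 20))
print(trim_collapsed("hello    world", 6))
```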

>>> info = lambda data: len(data)>10 and data[:10]+'...' or data
>>> info('sdfsdfsdfsdfsdfsdfsdfsdfsdfsdfsdf')
'sdfsdfsdfs...'
>>> info('sdfsdf')
'sdfsdf'

A simple and short helper function:

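def truncate_string(value, max_length=255, suffix='...'):
    string_value = str(value)
    # Strings that already fit are returned unchanged, so nothing is cut
    # off when the length is between max_length - len(suffix) and max_length.
    if len(string_value) <= max_length:
        return string_value
    # Otherwise leave room for the suffix within max_length.
    return string_value[:max_length - len(suffix)] + suffix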

Usage examples:

# Example 1 (default):

long_string = ""
for number in range(1, 1000): 
    long_string += str(number) + ','    

result = truncate_string(long_string)
print(result)


# Example 2 (custom length):

short_string = 'Hello world'
result = truncate_string(short_string, 8)
print(result) # > Hello... 


# Example 3 (not truncated):

short_string = 'Hello world'
result = truncate_string(short_string)
print(result) # > Hello world

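There is no need for a regular expression, but you do want to use string formatting rather than the string concatenation in the accepted answer.

This is probably the most idiomatic way to truncate the string data at 75 characters.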

>>> data = "11111111112222222222333333333344444444445555555555666666666677777777778888888888"
>>> info = "{}..".format(data[:75]) if len(data) > 75 else data
>>> info
'111111111122222222223333333333444444444455555555556666666666777777777788888..'
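On Python 3.6+ the same idea can be written with an f-string. This is only a sketch of an equivalent spelling, not something the original answer proposed:

```python
data = "11111111112222222222333333333344444444445555555555666666666677777777778888888888"

# Same logic as the str.format version, using an f-string instead.
info = f"{data[:75]}.." if len(data) > 75 else data

print(info)
print(len(info))
```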

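Here is a function I made as part of a new String class. It allows adding a suffix (if the string is long enough after trimming, although you don't need to force the absolute size).

I was in the middle of changing a few things, so there are some useless logic costs, for example checks that are no longer necessary now that there is a return at the top...

But it is still a useful function for truncating data.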

##
## Truncate characters of a string after _len'nth char, if necessary... If _len is less than 0, don't truncate anything... Note: If you attach a suffix, and you enable absolute max length then the suffix length is subtracted from max length... Note: If the suffix length is longer than the output then no suffix is used...
##
## Usage: Where _text = 'Testing', _width = 4
##      _data = String.Truncate( _text, _width )                        == Test
##      _data = String.Truncate( _text, _width, '..', True )            == Te..
##
## Equivalent Alternates: Where _text = 'Testing', _width = 4
##      _data = String.SubStr( _text, 0, _width )                       == Test
##      _data = _text[  : _width ]                                      == Test
##      _data = ( _text )[  : _width ]                                  == Test
##
def Truncate( _text, _max_len = -1, _suffix = False, _absolute_max_len = True ):
    ## Length of the string we are considering for truncation
    _len            = len( _text )

    ## Whether or not we have to truncate ( a negative _max_len, the default, disables truncation )
    _truncate       = ( False, True )[ _max_len >= 0 and _len > _max_len ]

    ## Note: If we don't need to truncate, there's no point in proceeding...
    if ( not _truncate ):
        return _text

    ## The suffix in string form
    _suffix_str     = ( '',  str( _suffix ) )[ _truncate and _suffix != False ]

    ## The suffix length
    _len_suffix     = len( _suffix_str )

    ## Whether or not we add the suffix
    _add_suffix     = ( False, True )[ _truncate and _suffix != False and _max_len > _len_suffix ]

    ## Suffix Offset
    _suffix_offset = _max_len - _len_suffix
    _suffix_offset  = ( _max_len, _suffix_offset )[ _add_suffix and _absolute_max_len != False and _suffix_offset > 0 ]

    ## The truncate point.... If not necessary, then length of string.. If necessary then the max length with or without subtracting the suffix length... Note: It may be easier ( less logic cost ) to simply add the suffix to the calculated point, then truncate - if point is negative then the suffix will be destroyed anyway.
    ## If we don't need to truncate, then the length is the length of the string.. If we do need to truncate, then the length depends on whether we add the suffix and offset the length of the suffix or not...
    _len_truncate   = ( _len, _max_len )[ _truncate ]
    _len_truncate   = ( _len_truncate, _max_len )[ _len_truncate <= _max_len ]

    ## If we add the suffix, add it... Suffix won't be added if the suffix is the same length as the text being output...
    if ( _add_suffix ):
        _text = _text[ 0 : _suffix_offset ] + _suffix_str + _text[ _suffix_offset: ]

    ## Return the text after truncating...
    return _text[ : _len_truncate ]

Here I use textwrap.shorten to fit the text within the width. It also includes part of the last word when that word takes up more than 50% of the maximum width.

import textwrap


def shorten(text: str, width=30, placeholder="..."):
    """Collapse and truncate the given text to fit in the given width.

    The text first has its whitespace collapsed. If it then fits in the *width*, it is returned as is.
    Otherwise, as many words as possible are joined and then the placeholder is appended.
    """
    if not text or not isinstance(text, str):
        return str(text)
    t = text.strip()
    if len(t) <= width:
        return t

    # textwrap.shorten also throws ValueError if placeholder too large for max width
    shorten_words = textwrap.shorten(t, width=width, placeholder=placeholder)

    # textwrap.shorten doesn't split words, so if the text contains a long word without spaces, the result may be too short without this word.
    # Here we use a different way to include the start of this word in case shorten_words is less than 50% of `width`
    if len(shorten_words) - len(placeholder) < (width - len(placeholder)) * 0.5:
        return t[:width - len(placeholder)].strip() + placeholder
    return shorten_words

Tests:

>>> shorten("123 456", width=7, placeholder="...")
'123 456'
>>> shorten("1 23 45 678 9", width=12, placeholder="...")
'1 23 45...'
>>> shorten("1 23 45 678 9", width=10, placeholder="...")
'1 23 45...'
>>> shorten("01 23456789", width=10, placeholder="...")
'01 2345...'
>>> shorten("012 3 45678901234567", width=17, placeholder="...")
'012 3 45678901...'
>>> shorten("1 23 45 678 9", width=9, placeholder="...")
'1 23...'
>>> shorten("1 23456", width=5, placeholder="...")
'1...'
>>> shorten("123 456", width=5, placeholder="...")
'12...'
>>> shorten("123 456", width=6, placeholder="...")
'123...'
>>> shorten("12 3456789", width=9, placeholder="...")
'12 345...'
>>> shorten("   12 3456789    ", width=9, placeholder="...")
'12 345...'
>>> shorten('123 45', width=4, placeholder="...")
'1...'
>>> shorten('123 45', width=3, placeholder="...")
'...'
>>> shorten("123456", width=3, placeholder="...")
'...'
>>> shorten([1], width=9, placeholder="...")
'[1]'
>>> shorten(None, width=5, placeholder="...")
'None'
>>> shorten("", width=9, placeholder="...")
''

Suppose that stryng is a string we wish to truncate and that nchars is the number of characters desired in the output string.

stryng = "sadddddddddddddddddddddddddddddddddddddddddddddddddd"
nchars = 10

We can truncate the string as follows:

def truncate(stryng:str, nchars:int):
    return (stryng[:nchars - 6] + " [...]")[:min(len(stryng), nchars)]

The results for certain test cases are shown below:

s = "sadddddddddddddddddddddddddddddd!"
s = "sa" + 30*"d" + "!"

truncate(s, 2)                ==  sa
truncate(s, 4)                ==  sadd
truncate(s, 10)               ==  sadd [...]
truncate(s, len(s)//2)        ==  sadddddddd [...]

My solution produces reasonable results for the test cases above.

However, some pathological cases are shown below:

Some pathological cases!

truncate(s, len(s) - 3)         ==  sadddddddddddddddddddddd [...]
truncate(s, len(s) - 2)         ==  saddddddddddddddddddddddd [...]
truncate(s, len(s) - 1)         ==  sadddddddddddddddddddddddd [...]
truncate(s, len(s) + 0)         ==  saddddddddddddddddddddddddd [...]
truncate(s, len(s) + 1)         ==  sadddddddddddddddddddddddddd [...
truncate(s, len(s) + 2)         ==  saddddddddddddddddddddddddddd [..
truncate(s, len(s) + 3)         ==  sadddddddddddddddddddddddddddd [.
truncate(s, len(s) + 4)         ==  saddddddddddddddddddddddddddddd [
truncate(s, len(s) + 5)         ==  sadddddddddddddddddddddddddddddd 
truncate(s, len(s) + 6)         ==  sadddddddddddddddddddddddddddddd!
truncate(s, len(s) + 7)         ==  sadddddddddddddddddddddddddddddd!
truncate(s, 9999)               ==  sadddddddddddddddddddddddddddddd!

Notably,

When the string contains new-line characters (\n), there could be an issue.
When nchars > len(s), we should print the string s without trying to print the [...]
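The second point can be handled with an explicit length check before the marker is appended. The following is a minimal sketch under that assumption; the name `truncate_safe` and the `marker` parameter are hypothetical and not part of the original answer.

```python
def truncate_safe(stryng: str, nchars: int, marker: str = " [...]") -> str:
    # Hypothetical variant of truncate(): a string that already fits is
    # returned as-is, so no partial marker such as ' [..' can appear.
    if len(stryng) <= nchars:
        return stryng
    # Too little room for any text plus the marker: fall back to a hard cut.
    if nchars <= len(marker):
        return stryng[:max(nchars, 0)]
    return stryng[:nchars - len(marker)] + marker


s = "sa" + 30 * "d" + "!"
print(truncate_safe(s, 10))
print(truncate_safe(s, len(s) + 1))
```

This sketch does not address the new-line issue; escaping the input first, for example with repr, would be one way to deal with that.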

Below is some more code:

import io

class truncate:
    r"""
    Example of code which uses truncate:

        s = "\r<class\n 'builtin_function_or_method'>"
        s = truncate(s, 10)()
        print(s)

    Examples of inputs and outputs:

        truncate(s, 2)()   ==  \r
        truncate(s, 4)()   ==  \r<c
        truncate(s, 10)()  ==  \r<c [...]
        truncate(s, 20)()  ==  \r<class\n 'bu [...]
        truncate(s, 999)() ==  \r<class\n 'builtin_function_or_method'>

    Other notes:
        Returns a modified copy of the string input
        Does not modify the original string
    """
    def __init__(self, x_stryng: str, x_nchars: int) -> None:
        """
        This initializer mostly exists to sanitize function inputs
        """
        try:
            stryng = repr("".join(str(ch) for ch in x_stryng))[1:-1]
            nchars = int(str(x_nchars))
        except BaseException as exc:
            invalid_stryng =  str(x_stryng)
            invalid_stryng_truncated = repr(type(self)(invalid_stryng, 20)())

            invalid_x_nchars = str(x_nchars)
            invalid_x_nchars_truncated = repr(type(self)(invalid_x_nchars, 20)())

            strm = io.StringIO()
            print("Invalid Function Inputs", file=strm)
            print(type(self).__name__, "(",
                  invalid_stryng_truncated,
                  ", ",
                  invalid_x_nchars_truncated, ")", sep="", file=strm)
            msg = strm.getvalue()

            raise ValueError(msg) from None

        self._stryng = stryng
        self._nchars = nchars

    def __call__(self) -> str:
        stryng = self._stryng
        nchars = self._nchars
        return (stryng[:nchars - 6] + " [...]")[:min(len(stryng), nchars)]

Source URL: https://stackoverflow.com/questions/2872512/python-truncate-a-long-string
