정렬 알고리즘

정렬의 종류

Selection Sort
- repeatedly selecting the smallest (or largest, depending on the sorting order) element from the unsorted portion of the array and moving it to the beginning of the sorted portion. O(n^2)

Bubble Sort
- repeatedly steps through the list, compares adjacent elements, and swaps them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted. O(n^2)

Insertion Sort
- It repeatedly takes an element from the unsorted portion of the array and inserts it into its correct position in the sorted portion of the array. The insertion sort algorithm is similar to how people sort playing cards in their hands.

Merge Sort
- a divide-and-conquer sorting algorithm that divides the input array into smaller subarrays, sorts each subarray recursively, and then merges the sorted subarrays to produce the final sorted array. O(nlogn)

Direct Access Array Sort
- if we are not limited to comparison operations, it is possible to beat the Ω(n log n) bound
- Construct a direct access array with size u and insert each item x into index x.key. Then simply read through the direct access array from left to right returning items as they are found
  - Inserting input Θ(n) + scanning the array Θ(u) time. (u is the biggest input possible)
- For the variation, we have Tuple sort, Radix sort, Counting sort.

트라이(Trie) : 자연어 처리/문자열 검색

트라이는 특정 문자열 집합의 접두사를 공유하는 방식으로 문자열 데이터를 저장하고 검색하는데 최적화되어 있습니다. 이 구조는 자동 완성, 스펠 체크, IP 라우팅(가장 긴 접두사 일치를 찾는 데 사용)과 같은 여러 애플리케이션에서 유용하게 사용됩니다.

* 대안으로 sql문의 like연산자를 사용가능 (select * from company where name like "keyword%")

노드 구성: 트라이의 각 노드는 자식 노드를 가리키는 포인터 배열을 가지고 있습니다. 배열의 인덱스는 특정 문자를 나타내고, 각 인덱스 위치의 값은 해당 문자로 시작하는 서브트리를 가리킵니다. 노드는 일반적으로 해당 노드가 문자열의 끝인지를 나타내는 플래그도 포함합니다.

장점:

문자열 검색 시간이 문자열의 길이에 비례하여 O(m)이 걸립니다(m은 문자열 길이).
해시 테이블과 달리, 문자열의 일부를 가지고 있는 키들도 빠르게 찾을 수 있어 접두사 검색에 매우 효율적입니다.

단점:

공간 복잡도가 높을 수 있습니다. 특히 알파벳의 문자가 많거나 저장된 문자열이 적은 경우 공간 낭비가 심할 수 있습니다.
접두사가 공유되지 않는 많은 양의 데이터를 저장하면 메모리 사용량이 크게 증가합니다.

환경설정

아파치의 Trie 라이브러리를 추가하여 구현한다.

implementation group : 'org.apache.commons', name :
       'commons-collection4', version: '4.3'

Trie 인스턴스를 빈으로 등록한다.

@Configuration
public class AppConfig {
    @Bean
    public Trie<String,String> tire(){
        return new PatriciaTrie<>();
    }
}

Trie사용법

trieInstance.put(key-word, value) : 단어를 character로 쪼개어 tree의 각 노드에 삽입한다. value는 leaf 노드에 부착함.

trieInstance.prefix(keyword) : keyword를 prefix로 갖는 모든 key-value map을 반환함.

trie.remove(keyword) : tree에 보관되어있는 keyword를 탐색하여 key-value pair를 삭제함.

Character Nodes: In a trie, each character of a key (typically a string) is stored in a separate node.

End Nodes: The value associated with a key is typically stored at the end node of that key.

'개발기술 > 알고리즘, 자료구조, PS' 카테고리의 다른 글

코딩테스트 탐색문제 (DFS,BFS,DP) (0)	2025.01.12
그래프이론 알고리즘, 자료구조 정리 (2)	2024.09.18
코딩테스트 문제후기 (3)	2024.05.21
코딩테스트 학습법 및 빈출유형 문제풀이 (0)	2024.01.21
자료구조 이론정리 (0)	2024.01.05

생각정리, 개발정리

정렬 알고리즘

정렬의 종류

트라이(Trie) : 자연어 처리/문자열 검색

'개발기술 > 알고리즘, 자료구조, PS' 카테고리의 다른 글

티스토리툴바

정렬 알고리즘

정렬의 종류

트라이(Trie) : 자연어 처리/문자열 검색

'개발기술 > 알고리즘, 자료구조, PS' 카테고리의 다른 글

'개발기술/알고리즘, 자료구조, PS' Related Articles

티스토리툴바